It's something UCSC has been doing on their own tracks for some time (e.g. the Promotor/Enhancer Histones mark track) but was made available to 3rd parties reasonably recently through their track hub feature.
The first requirement is that you will have to have somewhere you can upload bigWig files to that UCSC can access via HTTP. You can generate bigWig files using UCSC apps bedGraphToBigWig or wigToBigWig.
Note that if you intend on displaying tracks related together they should be appropriately normalised to make comparisons make sense. If your data is already normalised (for example % methylation from Bisulfite sequencing) then this is fine. Otherwise, if you have a bedGraph file representing a pileup (number of reads covering each basepair) then something like the following should work to normalise by library size:
awk -v NUM_FRAGS=x 'BEGIN{OFS="\t"}{print $1,$2,$3,$4/(NUM_FRAGS/1000000)}' file.bedGraph > file.normalised.bedGraphWhere NUM_FRAGS=x is changed to the correct number of fragments in your dataset. At this point you can follow the instructions from UCSC to setup your first track hub. I will briefly summarise them here though.
There's a number of little files required to make it work right.. The file structure I use is something like this:
/hub.txt /genomes.txt /hg18/trackDb.txt /hg18/project1.txtHere hub.txt is the "root" of the hub and is the URL that you give to UCSC to initialise the hub and load it into the browser. The format is minimal and just describes the human readable description of your hub and the location of a "genomes" file.
/hub.txt
hub HubNameHere shortLabel Label longLabel Longer Label Here genomesFile genomes.txt email tony@mcbryan.co.ukThe genomes file is simply a list of each genome you want to be able to display data against. Valid options are any of the genomes that UCSC supports on the browser (or your mirror).
Minimally this must include one genome and link to the associated trackDB file which describes the tracks to be loaded.
/genomes.txt
genome hg18 trackDb hg18/trackDb.txtUCSC uses trackDb files internally to format the display of tracks onto the genome. Track Hubs allow us to use a subset of the trackDb functionality to load our own tracks. UCSC has long had a Custom Tracks feature but this only ever allowed a very small fraction of the features of the trackDb while the Track Hubs allow us much more.
You can, if desired, simply keep all your track definitions in the single trackDb file but this will quickly get messy. Luckily we can import additional files.
/hg18/trackDb.txt
include project1.txtWhile in project1.txt we can include only the track definitions that are relevant to that project.
/hg18/project1.txt
track SuperTrack shortLabel Label longLabel Long Label superTrack on none priority 1 track CompositeTrack container multiWig configurable on shortLabel Label longLabel Long Label visibility hide type bigWig 0 100 autoScale off aggregate overlay viewLimits 0:100 windowingFunction mean superTrack SuperTrack full showSubtrackColorOnUi on track SubTrack1 type bigWig shortLabel Label longLabel Long Label parent CompositeTrack visibility full bigDataUrl http://a.b.com/path/to/wigs/SubTrack1.bigWig color 51,102,153 track SubTrack2 type bigWig shortLabel Label longLabel Long Label parent CompositeTrack visibility full bigDataUrl http://user:pass@a.b.com/path/to/wigs/SubTrack2.bigWig color 255,102,0Therefore project1.txt is where the magic happens. We have a super track (named SuperTrack) which contains all other tracks. This gives us a single drop-down box on the UCSC browser which we can use to turn a whole projects tracks on or off at once.
This super track contains a single composite track which is of type multiWig. The multiWig is capable of overlaying multiple wig tracks on top of each other (using the aggregate option). To make this work we create regular bigWig tracks with their parent set to the composite track.
The bigDataUrl is the important one as that tells UCSC how to actually get to your data (hosted on a web server somewhere). Note that UCSC supports providing the URL along with username and password (as shown above in SubTrack2). This is compatible with usernames and passwords enforced by htaccess on Apache server.
You can fix the viewlimits on wig files (here fixed at 0 to 100 within the composite track. Note that settings on a parent track are inherited by their child tracks unless the child has a conflicting setting.
If you prefer you can have a dynamic range (may be appropriate for ChIP-Seq data etc) which will resize the view to the bounds of your data. If you use dynamic scaling you probably also want to include the alwaysZero attribute on the composite track and set it to "on" otherwise UCSC will scale from your lowest value to your highest value within the window.
You can then view the track hub by entering it on the UCSC Hub Connect page or by creating a link such as:
http://genome.ucsc.edu/cgi-bin/hgTracks?db=hg18&hubUrl=http://user:pass@a.b.com/hub/hub.txt
You should be able to use the above as a reference point to building your own Track Hubs. For more information, and details on the other features you can use in Track Hubs, see UCSC's help pages.
No comments:
Post a Comment
Note: only a member of this blog may post a comment.