From: "Rieseberg, Loren"
To: "RIVIERE, Nathalie"
CC: "COQUE, Marie", Jerome Gouzy, nicolas langlade, "John M. Burke"
Subject: Re: sunflower genome
Date: Sun, 28 Sep 2014 03:43:28 +0000
Message-ID: <173469EE-3C0E-410A-B222-A81A0BA0B4F2@mail.ubc.ca>

Dear Nathalie,

I had hoped to be able to provide a final set of pseudomolecules to you in late spring, but the sunflower genome has not cooperated (as usual). We did annotate the Celera assembly and developed numerous tools linking the assembly (in JBrowse) to various genetic maps, QTLs, SNPs, expression data, BLAST capability, and so forth. I hope these have been useful to the group at Biogemma.

We have posted a set of “bronze pseudomolecules,” which represent a merger of the Allpaths and Celera assemblies, and which have been ordered and oriented as far as possible using the physical map and six genetic maps. This semi-final genome covers 3.64 Gb and is contained in 31,392 super-scaffolds, with an N50 of 226 Kb. It has been error-corrected, and the final gap-filling step starts next week. We tried various gap-filling tools, but they were not very good at dealing with both 454 and Illumina data, so we developed our own tool, which works better. We are using SAP’s Cloud server for the gap-filling, so we can do many iterations rather quickly.

We are not entirely pleased with the bronze pseudomolecules because the merger created some erroneous duplicate genes and chimeric scaffolds. While these problems have been largely corrected via masking or splitting, we have developed a second merged assembly in parallel, in which much more aggressive filtering and repeat masking was employed prior to the merger. It also included the new BAC-end sequences from INRA in the linkage group-specific scaffolding step. This “gold” version of the genome has fewer merger artifacts and gives us a much higher N50, but it also has about 20% fewer called base pairs in total.

Going forward, we are planning to offer two fully annotated genomes - the bronze and gold versions, which we think will be useful for different purposes. The bronze genome should be ready for annotation soon, possibly by the end of next week if the gap-filling goes well. It will be at least a month before the gold version is ready.

In terms of the future, we have submitted a proposal to NSF (John Burke is the PI) and are in the process of submitting a proposal to Genome Canada for funding in the $6.5 million range. The focus of both proposals is on the genomics of abiotic stress resistance. As before, these will be collaborative projects between UBC, INRA, UGA, and several companies (including Biogemma - we have been in communication with Marie about this), as well as the USDA. Although the focus is on abiotic stress, we intend to use some of the money for PacBio sequencing (or other long-read sequencing) and possibly optical mapping to improve the genome.

Over the next three months our hope is to complete and annotate (with Jerome) the bronze and gold versions of the genome. 2015 will be devoted in part to linking the tools/resources we have developed to these new assemblies. We also have circa 550 WGS sequences for wild and cultivated sunflower genomes that we plan to align to the new assemblies and call variants. This will mainly be done using some alignment and variant-calling approaches we have developed with SAP that outperform current approaches. We will also develop a sunflower pan-genome based on all of this sequence (i.e., we will assemble the unaligned genomic fraction for each genotype) and provide ancestry / pedigree information for the 290 or so cultivated genotypes we have sequenced. Lastly, we will provide GBS-based genotypes of pre-bred material created by the USDA and by our lab (the latter is being done in collaboration with SOLTIS). I have cc'ed Jerome, Nicolas, and John, so they can add details about their plans as well.

We have not yet discussed whether the consortium will continue in its current form beyond 2016. We have been in discussions with Nicolas about having INRA take on greater responsibility for the databasing in the next project phase, since they have annual base support for this effort. Similar discussions are underway with the USDA about whether they could contribute permanent resources for databasing of sunflower genomic resources. Finally, we are developing the DivSeek Initiative with the Global Crop Diversity Trust, which is a broader effort to provide such a database for 25 priority food security crops (sunflower is the only oilseed on this list). Thus, it is possible that we might find alternative funding for the long-term support of a website/database for sunflower genomic resources. However, if we cannot come up with permanent public funding, then we might continue to rely in part on contributions from interested companies. We will discuss this in greater detail at the PAG meetings in January.

Best Wishes, Loren

On Sep 25, 2014, at 6:27 AM, RIVIERE, Nathalie wrote:

> Hi Loren,
> I hope you had a great summer, and that things are running well. As you know we discussed with Jerome and team a couple of weeks ago about a new release on your website, but it seems that it does not correspond to the final one. I guess you will send an e-mail when the final version is available.
> I also wanted to discuss further about what you have in mind for the future. It seems that there is still work to do on the current data, for assembly and annotation, which will probably be done in the coming months in the frame of 2014 workplan. But for 2015, do you plan to maintain the current consortium, and if so, what would be the workplan? Are you considering applying for funding a new project?
> It is important for us to understand what can be done, and I would greatly appreciate it if you could give me more information on that.
> Regards
> Nathalie
>
> PS: I won’t attend the PAG this time, but some of my colleagues will certainly do so.
>
> _________________________________
> Nathalie Rivière
> Upstream Genomics Coordinator
> Bioinformatics Manager
>
> BIOGEMMA
> Route d'Ennezat
> SITE DE LA GARENNE
> CS 90126
> 63720 CHAPPES
>
> Tel : +33 (0)4-73-67-88-05
> Mobile: +33(0)6-47 24 63 58
> Fax: + 33 (0)4-73-67-88-99
> _________________________________
>
> BIOGEMMA S.A.S. au capital social de 48.335.652,00 €
> 1, Rue Edouard Colonne - 75001 PARIS
> RCS PARIS 412 514 366
>
> This message and any attachments are confidential and intended solely for the use of the addressee(s) named above. The information contained in this email may also be legally privileged. If you have received this email in error, please notify us immediately by reply email or by fax and then delete it. Any use, distribution or reproduction of this message is strictly prohibited. The integrity or authenticity of this message cannot be guaranteed. We therefore shall not be liable for the message if altered, changed or falsified. Thank you.

Loren Rieseberg
Botany Department
University of British Columbia
3529-6270 University Blvd
Vancouver, B.C. V6T 1Z4
Phone: 604-827-4540
Fax: 604-822-6089