Acknowledgements
We gratefully acknowledge the help of Dr. rer. nat. Diego Riaño-Pachón at Centro Nacional de Pesquisa em Energia e Materiais for his guidance in data analysis, and the Group of Computational and Evolutionary Biology at University of Los Andes for providing us access to the computing grid cluster. This work was performed under the auspices of the Grant (1204-452-21129) of the Instituto Colombiano para el fomento de la Investigación Francisco José de Caldas, Colciencias and by the Centro de Investigaciones Microbiológicas – CIMIC laboratory.
A representative genomic 16S rRNA gene sequence of L. arenae DSM 19593T was compared using NCBI BLAST [4,5] under default settings (e.g., considering only the high-scoring segment pairs (HSPs) from the best 250 hits) with the most recent release of the Greengenes database [6], and the relative frequencies of taxa and keywords (reduced to their stem [7]) were determined, weighted by BLAST scores. The most frequently occurring genera were Jannaschia (38.1%), Thalassobacter (15.4%), Octadecabacter (11.7%), Roseovarius (10.7%) and Roseobacter (10.2%) (28 hits in total). Regarding the three hits to sequences from other members of the genus, the average identity within HSPs was 96.0%, whereas the average coverage by HSPs was 98.7%. Among all other species, the one yielding the highest score was ‘Octadecabacter orientus’ (DQ167247), which corresponded to an identity of 99.2% and an HSP coverage of 99.6%. (Note that the Greengenes database uses the INSDC (= EMBL/NCBI/DDBJ) annotation, which is not an authoritative source for nomenclature or classification.) The highest-scoring environmental sequence was FJ664800 (Greengenes short name ‘Quantitative dynamics cells plankton-fed microbial fuel cell clone plankton D11’), which showed an identity of 97.0% and an HSP coverage of 99.6%. The most frequently occurring keywords within the labels of all environmental samples which yielded hits were ‘lake’ (9.9%), ‘tin’ (9.8%), ‘xiaochaidan’ (9.4%), ‘microbi’ (2.6%) and ‘sea’ (2.5%) (222 hits in total). No environmental samples yielded hits with a higher score than the highest-scoring species.
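The score-weighted relative frequencies reported above can be illustrated with a minimal sketch. It assumes each BLAST hit has already been reduced to a (label, score) pair, where the label is a genus name or a stemmed keyword. The text specifies only that frequencies were "weighted by BLAST scores", so the simple share-of-total-score weighting, the function name `weighted_frequencies`, and the example values below are all assumptions for illustration, not the authors' actual procedure or data.

```python
from collections import defaultdict


def weighted_frequencies(hits):
    """Return each label's share of the summed BLAST score, in percent.

    `hits` is an iterable of (label, score) pairs. This share-of-total
    weighting is an assumed reading of "weighted by BLAST scores".
    """
    totals = defaultdict(float)
    for label, score in hits:
        totals[label] += score

    grand_total = sum(totals.values())
    if grand_total == 0:
        # No hits (or only zero scores): nothing to report.
        return {}

    return {label: 100.0 * s / grand_total for label, s in totals.items()}


# Hypothetical (genus, score) pairs; NOT data from the study.
example_hits = [
    ("Jannaschia", 900.0),
    ("Jannaschia", 600.0),
    ("Roseobacter", 500.0),
]
frequencies = weighted_frequencies(example_hits)
```

Under this weighting, a genus represented by a few high-scoring hits can outrank one represented by many weak hits, which is why the percentages need not track raw hit counts.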
Figure 1 shows the phylogenetic neighborhood of L. arenae in a 16S rRNA sequence-based tree. The sequence of the single 16S rRNA gene in the genome does not differ from the previously published 16S rDNA sequence (EU342372).

Figure 1 Phylogenetic tree highlighting the position of L. arenae relative to the type strains of the type species of the other genera within the family Rhodobacteraceae.