DATA

A pathway-centered analysis of pig domestication and breeding in Eurasia
J. Leno-Colorado, N.J. Hudson, A. Reverter, M. Pérez-Enciso, 2017
G3: GENES, GENOMES, GENETICS Early online May 12

  • vcf163.raw.vcf.gz: vcf file containing the genotypes of 163 Asian wild, Asian domestic, European wild and European domestic pigs for 48,008,185 autosomal SNPs. Only SNPs with less than 30% missing data in each population group are included.


A deep catalog of autosomal single nucleotide variation in the pig
E Bianco, B Nevado, SE Ramos-Onsins, M Pérez-Enciso
PLoS ONE, 2015 19; 10(3): e0118867

  • dbSNP.vcf_def.gz: List of 47,934,075 autosomal biallelic SNPs in vcf format, with ancestral allele and population frequencies, obtained from the analysis of complete genome shotgun sequence data of 128 domestic pigs and wild boars distributed worldwide.

  • consensus.ancestral.fa.gz: bgzip-compressed fasta file containing the consensus ancestral genome for the pig, obtained using five suid outgroups (see page 3 of Bianco et al. for details). The file contains 1,758,670,670 positions for which we were able to determine the ancestral pig nucleotide. In a list of 50M SNPs, the reference allele was equal to the ancestral allele in 80% of cases, the alternative allele was equal to the ancestral allele in 13.5% of cases, and for the rest the ancestral allele could not be determined. If you use this information, please cite Bianco et al. 2015. PLoS ONE, 2015 19; 10(3): e0118867. To determine the ancestral allele of a SNP at position chr:pos, first index the file with "samtools faidx consensus_ancestral.fa.gz", then run "samtools faidx consensus_ancestral.fa.gz chr:pos-pos".

  • vcfPosfasta.py: Python script by J. Leno-Colorado that takes a vcf file as input and adds a field ANC=0/1/N to report whether the reference allele is ancestral, derived or unknown.
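
The lookup described above can be sketched in Python as follows. This is only an illustration, not the vcfPosfasta.py script itself: the function names are hypothetical, samtools is assumed to be on the PATH, and the fasta is assumed to be already indexed. The coding used here (0 when the reference allele matches the ancestral base, 1 when the alternative allele matches it, N otherwise) is an assumption about how ANC=0/1/N might be assigned.

```python
import subprocess


def faidx_region(chrom, pos):
    """Build the single-base region string chr:pos-pos used by samtools faidx."""
    return f"{chrom}:{pos}-{pos}"


def ancestral_base(fasta, chrom, pos):
    """Return the ancestral base at chrom:pos by calling samtools faidx.

    Assumes samtools is installed and `fasta` has already been indexed.
    """
    out = subprocess.run(
        ["samtools", "faidx", fasta, faidx_region(chrom, pos)],
        check=True, capture_output=True, text=True,
    ).stdout
    # The output is a fasta record: a '>' header line, then the sequence.
    seq = "".join(l for l in out.splitlines() if not l.startswith(">"))
    return seq.strip().upper()


def anc_code(ref, alt, anc):
    """Assumed coding: '0' if REF is ancestral, '1' if ALT is ancestral, else 'N'."""
    anc = anc.upper()
    if anc == ref.upper():
        return "0"
    if anc == alt.upper():
        return "1"
    return "N"
```

For a SNP with REF=A and ALT=G, anc_code("A", "G", ancestral_base(path, chrom, pos)) would give the value to report, where path, chrom and pos are supplied by the user.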