Failed in construction a new database #89

Open
juntaowang opened this issue Jun 26, 2023 · 0 comments

Hi,

I'm using CAT prepare to build a new database. The FASTA headers and the acc2tax file look like the excerpts below. The command ran smoothly, but it reported that "headers (100.0%) do not have a taxid assigned". Is there any specific requirement for the sequence ID format? Thanks!

Fasta file header

IMGVR_UViG_2049941000_000021.8890010
MSNRTIEQNLTLLSSTKRDIRNAINIKGGSVSSATPFADYATAINNLPSGGGGEDLDWSLIGYSARPSSIDDAYAYSRQI
YVDWDNTETDMSERYYQNKQIKYFPLVDTSNVTNMRSMFMDSALEYIPLLDTSNVTDMSNMLYGTKLTTIPQFNTSACTN
MRYTFVDTNITSIPLLDTRNVETFQGTFERTKITTIPQLNTSAATNMSEMFWDCTNLTSIPLLDTSNAENMNYMFVNCTS
LTDIPQLNTSACTSMYMMFNGCTGLTGFSALSGYDFTNVENCADMFSDCHNIPLTTLPVINTVNCTDFGSMLRFNSSALT
RVEGIYLNSWHGGQIFGDWNGDPDSQQNHPNLTYVMLYNLGMSEYASGFDLACLMDWGDGGTANHQSLIDSLYTNAFDRA
TAGYETKRIYLHPYTYARLSQQEIDNIESKGYEVVDINNQ
IMGVR_UViG_2049941000_000021.8890220
MNERNKKNLYSQLEDKYLKMLLGRAFKKYDDYTIATTIQSIVRLFPMTRRFDDKTIKILKDNCGNQHFRKMLNDRLK
IMGVR_UViG_2049941000_000021.8890360
MASSVDLSPVMRAMEKDFLDFVHLVMESDAGINRKVGVNTLARSDLYQTAWTLAQEGGGSLVVNIMLNDYLYYVEHGRRR
GAKMPPVEPIIRWARKNGIPTDNSTIFLIRRAIVRDGIQGRPIMEQVLGLIDEGMLEDNGYLDMVFDQIVKLVDEFFNR
IMGVR_UViG_2049941000_000021.8890500
MSHAREFIDYIGGHYPEIQRMLLAYCRNRGEEFSDDILHQTFLNCYETIDRKGEMSDPTPKGFRDYLFKAFKFNIMREKQ
YARNKNRASVEDFVTAWESFLESCPSADEKVQDDMRKDFGAWYIATRAEEAATEGAIDIESFHLWRIKTFMGFTYQQLGE
VTGAERVREKVLSVKHWLQENISRQDITDAFNERYGLGG
IMGVR_UViG_2049941000_000021.8889990
MKYLNKFDTKSDYQQSASTLEIPNVAYITATTEVIYNATQPKAYVVFADQAVGRKCVLLYSSDGVGCTEEDLAAVTYLNP
NDWKGTQSNPNPYTSFDELRYFTGITTIDRECFGDCPSLTSVTIPESVTEIAVFAFYDSPLESITFMSATAPTFGRDVFH
DQAASGNITVPANGEGNYYDLAVSLGNGWTINGHAPVRYVAFEDPLVASKCATLYGDGTGCTEADLAAVNMINASDWSGT
PITSFNELRYFTGVLIISDYAFSGCTDLTGVTFSDTYLYGIGFHTFEGCTALTSVNFGTHVAVIYGHAFNGCSALERIII
PESVQQINGDAFSNCSSLSTIIFKSVTPPPSFGVSGPVFYNISSTGTLMVPSGGTSNYQSIAQSLGAGWTVEEFITFADQ
TVGQKCATLYGNGDVCTEADLGAVTSLNANDWRSTNQQYPTPYTSFDELRYFTAITSIPQECFGYSTSLTSVTIPSAVTE
IGKWALYESPLSSITFMSTTPPTLGEEVFHDQAASGNITVPVGSESNYFSLAQSLGSGWTVNGQTPS
IMGVR_UViG_2049941000_000021.8890200
MKPVKDKDGYLVVHLSKNGKRKTHKVHRLVAEAFIPNDDPERKTQINHLSEFEKTNNRVENLCWMSPKENTNWGTRNERI
AKKMKNDKRSKIVHQFTLDGQFIKEYPSTHEVRRQTGFGRSHISECCSGKYKTAYGYIWRYK

acc2tax file header
accession.version taxid
IMGVR_UViG_2504643025_000006.2504643025.2504727034 2732094
IMGVR_UViG_2504643025_000001.2504643025.2504727036 2732094
IMGVR_UViG_2504643025_000006.2504643025.2504727036 2732094
IMGVR_UViG_2504643025_000001.2504643025.2504727039 2732094
IMGVR_UViG_2504643025_000001.2504643025.2504727040 2732094
IMGVR_UViG_2504643025_000006.2504643025.2504727042 2732094
IMGVR_UViG_2504756089_000005.2504756089.2505108989 10860
IMGVR_UViG_2504756089_000001.2504756089.2505109062 10860
IMGVR_UViG_2504756089_000001.2504756089.2505109063 10860
IMGVR_UViG_2504756089_000001.2504756089.2505109064 10860
IMGVR_UViG_2504756089_000005.2504756089.2505108986 10860
IMGVR_UViG_2504756089_000001.2504756089.2505109065 10860
IMGVR_UViG_2504756089_000001.2504756089.2505109066 10860
IMGVR_UViG_2504756089_000005.2504756089.2505108974 10860
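
One way to narrow this down is to count how many FASTA sequence IDs actually appear in the first column of the acc2tax file. The following is a minimal Python sketch, not part of CAT. The function names are hypothetical. It assumes that headers start with `>` (not visible in the excerpt above) and that the sequence ID is the first whitespace-delimited token of the header. The sample records are shortened copies of the excerpts. How CAT itself matches IDs is not confirmed here.

```python
def fasta_ids(lines):
    """Return the first whitespace-delimited token of each '>' header line."""
    ids = []
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.append(fields[0])
    return ids


def mapping_ids(lines):
    """Return the set of accessions in the first column, skipping the header row."""
    ids = set()
    for i, line in enumerate(lines):
        fields = line.split()
        if not fields or (i == 0 and fields[0] == "accession.version"):
            continue
        ids.add(fields[0])
    return ids


def unmatched(fasta_lines, mapping_lines):
    """Return FASTA IDs that have no entry in the mapping file, in file order."""
    known = mapping_ids(mapping_lines)
    return [seq_id for seq_id in fasta_ids(fasta_lines) if seq_id not in known]


# Shortened sample records taken from the excerpts above ('>' added by assumption).
fasta_example = [
    ">IMGVR_UViG_2049941000_000021.8890010",
    "MSNRTIEQNLTLLSS",
    ">IMGVR_UViG_2049941000_000021.8890220",
    "MNERNKKNLYSQLEDK",
]
mapping_example = [
    "accession.version taxid",
    "IMGVR_UViG_2504643025_000006.2504643025.2504727034 2732094",
    "IMGVR_UViG_2504756089_000005.2504756089.2505108989 10860",
]

missing = unmatched(fasta_example, mapping_example)
total = len(fasta_ids(fasta_example))
print(f"{len(missing)} of {total} headers have no mapping entry")
```

In the excerpts above, the FASTA IDs carry one dot-separated suffix while the acc2tax accessions carry two, so a difference in ID shape is one possible cause. The excerpts come from different genomes, though, so they may not be representative of the full files.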
