While this may not be an issue for many, I thought it worthwhile to mention here. I had been struggling to refine the database I was using for my COI gene: I let the pipeline run for a week and it still did not finish!
The issue was traced to the feature-classifier extract-reads step; my primers are probably too degenerate and are just causing mayhem.
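To put a number on "too degenerate", here is a minimal Python sketch that counts how many concrete sequences a primer written with IUPAC ambiguity codes can stand for. The primer string is a made-up example, not one of my actual primers, and the function name is my own. Whether degeneracy is really what slows extract-reads down is only my guess; the sketch just measures the degeneracy itself.

```python
# Number of concrete bases each IUPAC nucleotide code can represent.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def primer_degeneracy(primer: str) -> int:
    """Return how many distinct non-degenerate sequences the primer encodes."""
    total = 1
    for base in primer.upper():
        # Raises KeyError for symbols outside the table (e.g. inosine, "I").
        total *= IUPAC_DEGENERACY[base]
    return total


# Hypothetical primer, for illustration only: N (4) * R (2) * Y (2) = 16.
example_primer = "ACGTNRY"
print(primer_degeneracy(example_primer))
```

Each extra ambiguous position multiplies the count, so a primer with many such positions can represent hundreds or thousands of variants.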
As per Qiime 2 forum:
Previously I brought up this topic, but it could not be resolved:
I am quite a newbie to Qiime2 and I seem to have run into a potential problem with qiime feature-classifier extract-reads.
The extract-reads step is taking a very long time. I have not experienced this problem using my other primers (18S) with other databases (PR2 and SILVA). I am running my pipeline on a high performance computer (so computing power is not a problem; I am currently using mem-per-cpu=10GB and cpus-per-task=16), and with those a run generally takes ~2 hours. The database I am using now (Midori) is double the size of the others, but I do not understand how 48 hours is not sufficient for it to run.
The memory is fine; the slurm output says the job simply ran out of time. I have been running other jobs since then and there are no problems regarding memory.
It is just very frustrating when I am tweaking the pipeline and have to wait for the end result to see how it influences the results.
Update: it did not complete running in a week either
I have found that if I skip the feature-classifier extract-reads step, I can get the results within 24 hours for COI. As 18S worked with the previous method, I tested both approaches on it and compared the results, and they do differ (generally only once down to genus/species, so for 90% of these cases I can see how they could potentially be compared). So I would just like advice on whether it is advisable to do this, as I see no other option.
Appreciate the help
extract-reads is definitely not necessary, and the advantages that we see for 16S may not generalize to other marker genes (as we note here). It gives a small boost in accuracy, but that is not worth the wait time you are experiencing for your COI database.
I would definitely recommend just proceeding without trimming — at worst, there will be a slight accuracy decrease at species level.
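In practice, "proceeding without trimming" means handing the full-length reference sequences straight to classifier training instead of the output of extract-reads. Below is a rough Python sketch that only builds and prints such a command. It assumes a naive Bayes classifier is being trained with fit-classifier-naive-bayes, which the thread does not actually state, and the .qza file names are placeholders I made up.

```python
import shlex


def fit_classifier_command(reference_reads, reference_taxonomy, classifier_out):
    """Build the training command that uses untrimmed reference sequences."""
    return [
        "qiime", "feature-classifier", "fit-classifier-naive-bayes",
        "--i-reference-reads", reference_reads,        # full-length sequences, no extract-reads
        "--i-reference-taxonomy", reference_taxonomy,
        "--o-classifier", classifier_out,
    ]


# Placeholder artifact names, for illustration only.
cmd = fit_classifier_command(
    "midori-seqs.qza", "midori-taxonomy.qza", "midori-classifier.qza"
)

# Print the command rather than running it, so it can be checked first.
print(shlex.join(cmd))
```

The printed line can be pasted into a slurm script once the file names match your own artifacts.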
As always, the guys at Qiime respond without delay and are super helpful!