I have acquired several very large files. Specifically, CSVs of 100+ GB.
I want to search for text in these files faster than manually running grep.
To do this, I need to index the files, right? Would something like Aleph be good for this? It seems like the right tool…
https://github.com/alephdata/aleph
Any other tools for doing this?
Are you looking for specific values in some field in this table, or substrings in that field?
If specific values, I’d probably import the CSV file into a database with a column indexed on the value you care about.
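As a rough sketch of that approach, here is one way it might look with SQLite from Python's standard library. The table name `records`, the column names, and the helpers `load_csv`, `index_column`, and `lookup` are all made up for illustration, and the code assumes the first CSV row is a header and every row has the same number of fields. For a real 100+ GB file you would open the file on disk and use an on-disk database instead of the in-memory demo below.

```python
import csv
import io
import sqlite3


def load_csv(conn, csv_file, table="records", batch_size=50_000):
    """Stream rows from an open CSV file into SQLite in batches,
    so the whole file never has to fit in memory."""
    reader = csv.reader(csv_file)
    header = next(reader)  # assumption: first row holds the column names
    cols = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    placeholders = ", ".join("?" for _ in header)
    insert = f'INSERT INTO "{table}" VALUES ({placeholders})'
    batch = []
    for row in reader:
        batch.append(row)
        if len(batch) >= batch_size:
            conn.executemany(insert, batch)
            batch.clear()
    if batch:
        conn.executemany(insert, batch)
    conn.commit()


def index_column(conn, column, table="records"):
    """Build the index after loading; this is usually faster than
    maintaining it during the bulk insert."""
    conn.execute(f'CREATE INDEX "idx_{table}_{column}" ON "{table}" ("{column}")')
    conn.commit()


def lookup(conn, column, value, table="records"):
    """Exact-match lookup, which can use the index on `column`."""
    query = f'SELECT * FROM "{table}" WHERE "{column}" = ?'
    return conn.execute(query, (value,)).fetchall()


# Tiny in-memory demo with invented data.
sample = io.StringIO("id,email,city\n1,a@example.com,Oslo\n2,b@example.com,Lima\n")
conn = sqlite3.connect(":memory:")
load_csv(conn, sample)
index_column(conn, "email")
rows = lookup(conn, "email", "b@example.com")
```

The import is a one-time cost. After that, exact-value lookups on the indexed column no longer need to scan the whole file the way grep does.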
Many (most?) databases these days support some sort of full text search.