I have acquired several very large files. Specifically, CSVs of 100+ GB.

I want to search for text in these files faster than manually running grep.

To do this, I need to index the files, right? Would something like Aleph be good for this? It seems like the right tool…

https://github.com/alephdata/aleph

Any other tools for doing this?

  • tal@lemmy.today
    2 months ago

    Are you looking for specific values in some field in this table, or substrings in that field?

    If specific values, I’d probably import the CSV file into a database, with an index on the column you care about.
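
    A rough sketch of that, using Python's built-in sqlite3 module. SQLite is just one choice of database here, and the table name `records`, the index name `idx_key`, and the two helper functions are made up for illustration. It assumes the CSV has a header row, every row has the same number of fields, and the header names contain no double quotes. All values are stored as text.

```python
import csv
import sqlite3


def build_index(csv_path, db_path, key_column):
    """Load a CSV into a SQLite table and index one column.

    Returns the list of column names read from the header row.
    """
    conn = sqlite3.connect(db_path)
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)  # assumes the first row holds column names
        cols = ", ".join(f'"{name}"' for name in header)
        marks = ", ".join("?" for _ in header)
        conn.execute(f"CREATE TABLE IF NOT EXISTS records ({cols})")
        # executemany consumes the reader lazily, so the file is streamed
        # row by row instead of being loaded into memory at once.
        conn.executemany(f"INSERT INTO records VALUES ({marks})", reader)
    # Build the index after the bulk insert; this is usually cheaper
    # than maintaining it while the rows are being inserted.
    conn.execute(
        f'CREATE INDEX IF NOT EXISTS idx_key ON records ("{key_column}")'
    )
    conn.commit()
    conn.close()
    return header


def lookup(db_path, key_column, value):
    """Return all rows whose key_column equals value exactly."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        f'SELECT * FROM records WHERE "{key_column}" = ?', (value,)
    ).fetchall()
    conn.close()
    return rows
```

    You'd run `build_index("big.csv", "big.db", "email")` once (the file and column names are placeholders), and after that `lookup("big.db", "email", "someone@example.com")` is an indexed exact-match query. Note that an index like this helps with exact values, not with substring searches such as `LIKE '%foo%'`.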