view the rest of the comments
Selfhosted
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
Rules:
-
Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
-
No spam posting.
-
Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
-
Don't duplicate the full text of your blog or github here. Just post the link for folks to click.
-
Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).
-
No trolling.
-
No low-effort posts. This is subjective and will largely be determined by the community member reports.
Resources:
- selfh.st Newsletter and index of selfhosted software and apps
- awesome-selfhosted software
- awesome-sysadmin resources
- Self-Hosted Podcast from Jupiter Broadcasting
Any issues on the community? Report it using the report flag.
Questions? DM the mods!
I've used java Scanner objects to do this extremely efficiently with minimal memory required even with multiple parallel searches. Indexing is only necessary if you want to search for information many times and don't know what exactly the search will be. For one time searches, it's not going to be useful. Grep honestly is going to be faster and more efficient for most one time searches.
The initial indexing or searching of the files will be bottlenecked by the speed of the disk the files are on, no matter what you do. It only helps to index because you can move future searches to faster memory.
So it greatly depends on what and how often you need to search and the tradeoff is memory usage, but only for multiple searches of data you choose to index from the files in the first pass.