submitted 1 year ago by U2H54@lemm.ee to c/books@lemmy.ml

In 2023, several of the largest online libraries began introducing search across the full text of every book in their collections, going beyond titles, descriptions, and other book metadata.
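To make the difference concrete, here is a minimal sketch of metadata search versus full-text search using a simple inverted index. The two-book catalogue, the field names, and the helper functions are all hypothetical, made up for illustration; none of the libraries mentioned in this post is known to work this way internally.

```python
import re
from collections import defaultdict

# Hypothetical toy catalogue: each record has a title (metadata) and full text.
BOOKS = {
    1: {"title": "Sea Stories", "text": "The lighthouse keeper watched the storm."},
    2: {"title": "Lighthouse Engineering", "text": "Masonry towers resist wind loads."},
}


def tokenize(s):
    """Lowercase the string and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", s.lower())


def build_index(books, field):
    """Map every word found in the given field to the set of book ids containing it."""
    index = defaultdict(set)
    for book_id, record in books.items():
        for token in tokenize(record[field]):
            index[token].add(book_id)
    return index


def search(index, query):
    """Return the sorted ids of books that contain every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)


title_index = build_index(BOOKS, "title")  # metadata-only search
text_index = build_index(BOOKS, "text")    # full-text search

print(search(title_index, "storm"))  # the word appears in no title
print(search(text_index, "storm"))   # the word appears inside book 1
```

A query for "storm" finds nothing in the title index but does find book 1 in the text index, which is the kind of result a metadata-only search cannot return.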

Here are some notable developments in this area:

  1. Z-library now offers a full-text search across its vast collection of books (over 14 million books and 84 million articles).

  2. The Nexus/STC software company provides full-text search over its current selection of 250,000 books and approximately 2 million papers. It continues to add around 10,000 new items daily and aims to index all the books from the largest online libraries within the next year (Anna's Archive, the largest repository, hosts 21 million books and 97 million papers). Additionally, Nexus/STC was the first to develop an AI technology that works with all the data from that many books (the 250,000 selected items mentioned above).

  3. Anna's Archive, which aggregates all items provided by Z-library, Library Genesis, Sci-Hub, and other resources, has long been developing full-text search functionality. While the release date remains unknown, it is anticipated in the near future.

  4. In August 2023, Google Books introduced a limited full-text search feature that allows users to search within the abstracts of its indexed books. Due to copyright constraints, the project's development is restricted, offering only a glimpse of its potential without comprehensive research capabilities.

  5. OpenLibrary features a "Search Inside" tool on its platform, yet its book collection is dozens of times smaller than Z-library's, and the tool lacks additional parameters for refining searches.

Shadow libraries currently house the largest online collections of digitized and born-digital books in the world. Their holdings surpass those of any other platform, so comprehensive search results are difficult to achieve even with full-text search functionality. Full-text search over them is a powerful research tool that returns results unavailable through Google or any other search engine. To keep up with these search and AI technologies, American and European companies must urgently advocate for radical changes in copyright laws.

Significant growth is anticipated in the following areas:

  • Instant access to the complete collections of the world's largest libraries for individuals worldwide, facilitated by extensive digitization efforts focusing initially on non-fiction. Most importantly, the best-known and most widely used books have already been digitized.
  • Full-text search across all indexed books simultaneously.
  • AI systems fed with all the data from online libraries, allowing them to work with a crucial part of the knowledge available to humankind. The capabilities of these new AI models will far surpass those of current ones.

The number of ebooks available for free access is increasing every year. It seems impossible to combat this phenomenon: the storage space that books require is negligible, which leads to the creation of numerous copies and backups. It is likely that subscription-based models, priced by usage level, will emerge as a way to sustain these online services, and that the book industry will have to accept this and adapt.

P.S. Please note that approximately 98% of links to Z-library are scams, potentially run by government entities to create confusion among users. All shadow libraries have to withstand severe pressure from the FBI and other government agencies.

Tags: ai-industry-development, all-books-access, all-books-digitization, ai-books-feedingi, libraries-full-text-search, indexing-all-books, online-libraries, open-access, search-technologies-development, shadow-libraries, the-largest-libraries

https://www.goodreads.com/author/show/37607982.Artem_Orel/blog

this post was submitted on 20 Oct 2023