[-] marsara9@lemmy.world 44 points 1 year ago

Thanks for the shout-out.

But FYI I've run into some bugs that are preventing new content from being indexed. So you won't see anything new (from about a week ago onward) until I can find a new method to fetch new posts.

[-] marsara9@lemmy.world 48 points 1 year ago* (last edited 1 year ago)

Playing devil's advocate for a bit... So these are just cross-posts. Which existed even on Reddit. ...I assume they weren't handled in any way in Sync or Reddit?

But let's say this is fixed... What to do about the multiple comment threads? How would you reconcile them with each other? Especially since the user can choose different ways to sort the comments as well. Would all of this logic, which is normally handled by the Lemmy back-end, now need to run on your phone? Also, how do you choose which post / instance to actually display and which ones to hide?

Btw, I'm not trying to dismiss the idea. Just want to call out some of the technical problems that might come up trying to implement such a feature. As well as ask questions to try and determine exactly how such a feature is expected to work.

[-] marsara9@lemmy.world 9 points 1 year ago

Not necessarily. I have several servers behind Cloudflare for free. I'm just limited on analytics, some advanced firewall settings, advanced cache management and maybe a few other features that I don't use. But the basic service is free.

https://www.cloudflare.com/plans/free/

[-] marsara9@lemmy.world 29 points 1 year ago

With ActivityPub all of the primary ids contain the domain of the hosting server. So if you lose your domain none of the other instances know that you're the authority on those communities, posts, comments or users. So essentially federation breaks with all of the old data.
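
To make that concrete, here is a minimal Python sketch of why the domain matters. The id below is a made-up example that only mimics the shape of a post URL, and `authority_of` is a hypothetical helper, not part of any ActivityPub library.

```python
from urllib.parse import urlparse

def authority_of(activitypub_id: str) -> str:
    """Return the host that an object id names as its origin server."""
    return urlparse(activitypub_id).hostname or ""

# Hypothetical id: the hosting server's domain is baked into it.
post_id = "https://lemmy.world/post/123456"
print(authority_of(post_id))  # lemmy.world
```

If the instance moves to a new domain, every id minted under the old one still points at the old host, so other instances have no way to tell that the new domain speaks for that data.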

[-] marsara9@lemmy.world 12 points 1 year ago

Unless you have an account there's no easy way to get access to the content on the page. Once you have an account there's technically nothing stopping you from just saving the HTML file to your computer.

Something else you can try though, assuming you don't have an account, is to just turn off JavaScript. If the site lets you partially load the content and then asks you to create an account to read more, they usually just block the content by having JavaScript add an opaque overlay. With JavaScript disabled, obviously it's not there to add the overlay and you're able to keep reading.

[-] marsara9@lemmy.world 12 points 1 year ago* (last edited 1 year ago)

Check out my post history.

But try https://www.search-lemmy.com. It has a few bugs but it should work for you, especially if you set your home instance to something large like lemmy.world.

Edit: if you want to help contribute: https://www.github.com/marsara9/lemmy-search

154

I keep seeing people complain about not being able to find active communities that match their interests. So I've added a new feature to https://www.search-lemmy.com/ that lets you search posts for a particular topic and then tells you which communities have the most posts matching your search query.

And assuming that you've set your home instance correctly, those links will even open up in your home instance, so that you can subscribe directly to them.

For example, if you search for 'linux' (https://www.search-lemmy.com/find-communities/results?query=linux&page=1) it gives you a link to each community, tells you which instance it's on and how many matches it found for your query.

All of the same filters that you can use on the normal search can be used here as well. So if you just want to find the best community that mentions linux on lemmy.world (https://www.search-lemmy.com/find-communities/results?query=linux+instance%3Alemmy.world&page=1), you can filter by just that instance. Click on the Search Tips button to see a list of all of the available filters.
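
As a rough illustration, the two example URLs above can be rebuilt with a few lines of Python. The function name and its parameters are assumptions made for this sketch; only the base URL and the `query` / `page` parameters come from the links themselves.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "https://www.search-lemmy.com/find-communities/results"

def find_communities_url(terms: str, instance: Optional[str] = None, page: int = 1) -> str:
    """Build a find-communities URL, optionally adding an instance: filter."""
    query = terms if instance is None else f"{terms} instance:{instance}"
    # urlencode turns spaces into '+' and ':' into '%3A', as in the links above.
    return f"{BASE}?{urlencode({'query': query, 'page': page})}"

print(find_communities_url("linux"))
print(find_communities_url("linux", instance="lemmy.world"))
```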

P.S. I'm aware of https://lemmyverse.net/ etc... and while those are great as well, this allows you to search to see what people are actually talking about on the various communities.

Again, if you have any feature requests or find any bugs, PLEASE reach out or ideally go to my github (https://github.com/marsara9/lemmy-search) and log a bug there.

251

A couple days ago I updated https://search-lemmy.com/ to 0.4.0.

New features that several people were asking for:

  • The UI has been overhauled and it should be much easier to find your home instance now.
  • Search itself has been overhauled. Search performance has increased significantly. I also automatically search for related terms. You may now see fewer search results, but ideally they should be more relevant. You can also now include basic syntax like:
    • quotes: "some terms that must be together"
    • negative terms: cat -dog (shows posts about cats that don't mention dogs)
    • either or: cat OR dog (shows posts about either cats or dogs). The default search behavior is now an implicit AND, but order doesn't matter.
  • I've added several new filters that you can use including:
    • !safeoff -- Disables safe search allowing NSFW posts to appear in the search results (NSFW is now hidden by default)
    • since:YYYY-MM-DD -- shows only posts that have occurred since the specified date
    • until:YYYY-MM-DD -- same as above but in reverse. It will only show posts up to the given date.
  • I've removed the preferred-instance query parameter from the results URL so it should be easier to share links to search results now.
  • The date the post was created or last updated is now displayed in the search results.
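
For anyone curious how the basic syntax above could be told apart, here is a minimal Python sketch. It is not the code the site runs: the `tokenize` function and its token labels are assumptions for illustration, and it leaves out the filters (!safeoff, since:, until:) entirely.

```python
import re
from typing import List, Tuple

# Either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')

def tokenize(query: str) -> List[Tuple[str, str]]:
    """Label each piece of a query as a phrase, OR, exclusion, or plain term."""
    tokens = []
    for phrase, word in TOKEN.findall(query):
        if phrase:
            tokens.append(("phrase", phrase))      # "terms that must be together"
        elif word == "OR":
            tokens.append(("or", word))            # cat OR dog
        elif word.startswith("-") and len(word) > 1:
            tokens.append(("exclude", word[1:]))   # cat -dog
        else:
            tokens.append(("term", word))          # implicitly ANDed
    return tokens

print(tokenize('"some terms that must be together" cat -dog'))
```

Plain terms with no OR between them would then be combined with AND, in any order.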

Bug Fixes:

  • Site performance should now be stable. Fixed a bug related to the database pool that was causing the site to hang.
  • Fixed a bug that would cause broken links.
  • Fixed various bugs with the crawler causing posts to be missed.

Known Issues:

  • If you set your home-instance to a fairly small instance, the number of search results is also relatively small. Once https://github.com/LemmyNet/lemmy/issues/3259 is resolved, I should be able to show links regardless of what your home instance is set to, allowing you to search the entire Fediverse.
  • Currently, searching only looks at the post title and body; comments aren't indexed yet. This also depends on the above Lemmy issue.

Finally some things to note:

I've started to refactor the code to abstract Lemmy away from the actual search engine, as I now start preparing to search other Fediverse instances like Kbin, and maybe even Mastodon, etc.

[-] marsara9@lemmy.world 52 points 1 year ago

IMHO federation doesn't bring any real benefits to git and introduces a lot of risks.

The git protocol, if you will, already allows developers to back up and move their repositories as needed. And the primary concern with source control is having a stable and secure place to host it. GitHub already provides that, free of charge.

Introducing federation, how do you control who can and cannot make changes to your codebase? How do you ensure you maintain access if a server goes down?

So while it's nice that you can self host and federate git with GitLab, what value does that provide over the status quo? And how do those benefits outweigh the risks outlined above?

[-] marsara9@lemmy.world 13 points 1 year ago

https://www.search-lemmy.com/

https://www.github.com/marsara9/lemmy-search

It only works for Lemmy, for now. And please feel free to post any feature requests or bugs to GitHub as it's still fairly new.

You can also check my comment/post history for more details.

[-] marsara9@lemmy.world 21 points 1 year ago* (last edited 1 year ago)

https://www.search-lemmy.com

http://www.github.com/marsara9/lemmy-search

Just add community:!nostupidquestions@lemmy.world at the end of your query.

81
submitted 1 year ago* (last edited 1 year ago) by marsara9@lemmy.world to c/fediverse@lemmy.world

I shared bits and pieces of this before, but it's officially up and running now: https://www.search-lemmy.com/

This is an enhanced search engine for Lemmy. With a few primary goals:

  • You can choose a preferred instance. After you choose your primary instance and perform a search, ALL links will open in that instance.
  • This aims to be a replacement for using site:reddit.com in Google, but just for the fediverse.
  • You can filter the search results by:
    • Instance -- This will filter the results to only show communities that belong to a particular instance. Just type something like instance:lemmy.world or instance:https://lemmy.world/. This is separate from your preferred instance, such that you can search for posts on lemmy.world while still opening them on lemmy.ml.
    • Community -- You can refine the search by a specific community. You use the same syntax that you'd use here: community:!fediverse@lemmy.world.
    • Author -- Similar to the above you can also filter by a specific author such as: author:@marsara9@lemmy.world.
  • The entire thing is open-source. You can view the code and even host your own instance... See more details here: https://github.com/marsara9/lemmy-search.
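
To show how the instance, community and author filters could be separated from the rest of a query, here is a small Python sketch. It is only an illustration under my own assumptions (the `split_filters` name and its return shape are made up), not the parser in the repository.

```python
from typing import Dict, Tuple
from urllib.parse import urlparse

FILTER_KEYS = ("instance", "community", "author")

def split_filters(query: str) -> Tuple[Dict[str, str], str]:
    """Pull key:value filters out of a query and return (filters, remaining terms)."""
    filters: Dict[str, str] = {}
    terms = []
    for word in query.split():
        key, sep, value = word.partition(":")
        if sep and key in FILTER_KEYS and value:
            # Accept both instance:lemmy.world and instance:https://lemmy.world/
            if key == "instance" and "://" in value:
                value = urlparse(value).hostname or value
            filters[key] = value
        else:
            terms.append(word)
    return filters, " ".join(terms)

print(split_filters("linux instance:https://lemmy.world/ author:@marsara9@lemmy.world"))
```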

NOTE: This only supports Lemmy instances for now. Other types of fediverse instances may be supported in the future, depending on how this works out.

I've been working on this over just the last few weeks, so it hasn't had a chance to crawl much of the fediverse yet. For now it only supports lemmy.world and lemmy.ml but other preferred-instances will come online as time goes by.

If anyone finds any bugs, and I'm sure you will, or if anyone has any suggestions PLEASE raise an issue on GitHub for me to track. Lastly, if anyone wants to help contribute please feel free to reach out.

NOTE TO SERVER ADMINS: You can prevent your site from being crawled by adding a rule for the lemmy-search user-agent to your robots.txt.
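
For admins who want to check such a rule before deploying it, here is a minimal Python sketch using the standard library's robots.txt parser. The two-line robots.txt and the example.org URL are assumptions for illustration; only the lemmy-search user-agent name comes from the note above.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: block the lemmy-search crawler from the whole site.
ROBOTS_TXT = """\
User-agent: lemmy-search
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The named crawler is blocked; crawlers without a matching rule are not.
print(parser.can_fetch("lemmy-search", "https://example.org/post/1"))    # False
print(parser.can_fetch("some-other-bot", "https://example.org/post/1"))  # True
```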

[-] marsara9@lemmy.world 11 points 1 year ago

You'll be able to get an initial set of data from federation but you won't get any data beyond that.

Federation works via "pushes", but since your instance would be behind a VPN the other instances in the federation wouldn't be able to see it to push content updates.

[-] marsara9@lemmy.world 91 points 1 year ago* (last edited 1 year ago)

I'm working on a specialized search engine just for the fediverse. https://github.com/marsara9/lemmy-search

If anyone wants to help out, feel free to reach out, but I hope to have something ready to release soon.

The idea with my version is that it'll search as much of Lemmy / the fediverse as it can and you can select the preferred instance that you want to open any link with.
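
As a sketch of what opening a link on your preferred instance means, assuming the usual /c/name@instance community path, something like the following would do. The function name is hypothetical and this is not necessarily how the project builds its links.

```python
def open_in_home_instance(community: str, home_instance: str) -> str:
    """Build a link to a remote community as seen from the user's home instance."""
    name = community.lstrip("!")  # accept both !name@host and name@host
    return f"https://{home_instance}/c/{name}"

# A lemmy.world community, opened from a lemmy.ml account.
print(open_in_home_instance("!fediverse@lemmy.world", "lemmy.ml"))
```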

[-] marsara9@lemmy.world 41 points 1 year ago

So I've been working on a solution for this.

As I see it, Google and others are going to have a hard, if not impossible, time incorporating the fediverse, given that the same content can exist on multiple servers.

So I'm working on a search engine built specifically for this, for Lemmy at least, where tapping on a search result takes you to whatever your preferred instance is.

I hope to have an MVP up and running in a few more days.

marsara9

joined 1 year ago