Scraping the entire internet is expensive.
Of course there's a decentralized search engine. If everyone used YaCy and ran a crawl of everything that is missing from YaCy, nobody would need the Google search engine, period!
New Edit:
To run YaCy less aggressively (on a low-resource device):
- sh startYACY.sh -d
Controlling CPU usage:
On the /PerformanceQueues_p.html page, you can limit the maximum number of some threads (light processes inside the JVM) in the Thread Pool Settings table. When you are crawling, you can also control the speed in PPM (Pages Per Minute) on the /Crawler_p.html page.
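For anyone wondering what a PPM cap means in practice, here is a rough Python sketch of the arithmetic: a cap of N pages per minute is just a pause of 60/N seconds between fetches. This is not YaCy's code; `delay_for_ppm`, `crawl_throttled` and the `fetch` callback are made-up names for illustration only.

```python
import time


def delay_for_ppm(pages_per_minute: float) -> float:
    """Seconds to wait between fetches so the rate stays at or below the cap."""
    if pages_per_minute <= 0:
        raise ValueError("pages_per_minute must be positive")
    return 60.0 / pages_per_minute


def crawl_throttled(urls, fetch, pages_per_minute, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between fetches.

    `fetch` is a hypothetical callback supplied by the caller;
    `sleep` is injectable so the pacing can be checked without waiting.
    """
    delay = delay_for_ppm(pages_per_minute)
    results = []
    for i, url in enumerate(urls):
        if i > 0:  # no need to wait before the very first page
            sleep(delay)
        results.append(fetch(url))
    return results
```

So a cap of 120 PPM works out to half a second between pages, and lowering the cap on a weak device simply stretches that pause.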
I didn't know about YaCy, and now have a running instance of it. Many thanks!
I have tried running YaCy as a senior node recently. I couldn't get the crawler to work for some reason. I'll have to look into it again some time.
The biggest problem with YaCy is that it's slow Java; they should have done this in C, or in Go/Rust nowadays.
Java is memory hungry but certainly not slow, especially compared to Go
Because Google and Bing (and, less prominently, Yandex and Baidu) have an oligopoly on how most of the world searches for things.
Search engine indexing takes years, and open source engines lack the infrastructure to speed up web crawling. SearX is just a proxy that people host as instances. We have YaCy, Wiby, Mojeek, MetaGer and some other niche search engines. Then we have Qwant, which has its own index and fills in whatever it has not indexed yet with results from Bing. Then there is DuckDuckGo, with its own indexing, but it is USA-based. Then there is System1's Startpage, essentially a Google proxy with no index of its own, which is hostile to Tor users.
Then there are others.
I settled on using Qwant for searching and Yandex via VPN for reverse image searching. I use Startpage extremely rarely, and only with a VPN.
I'm looking forward to Europe's Open Search Foundation.
Is there any result coming from that foundation yet? It looks like a paper tiger to me.
Doesn't seem like there are any from them.
when really the solution is decentralization
Why would it be? It will always be faster to ship updates and improve algorithms on a centralized system, and the UX will be better too.
We have some solutions for money with bitcoin
Solution to what? The only solution Bitcoin brings is to decentralize centralized currencies, at the cost of privacy and of accessibility for most people, and with unbelievable energy consumption.
energy consumption
Bitcoin alone draws more electricity than the whole of the Netherlands now. Its network is basically a crime against humanity at this point. And I say this as someone with a (small, tiny even) stake in it.
It's negated all gains made by renewable energy, and for what? A "currency" that is even more centralized in the hands of a few people than fiat money is.
I really like Nano for this exact reason (and quick/feeless payments ofc). Bitcoin is just too power-hungry... But who knows if any cryptocurrency could ever reliably replace fiat.
But who knows if any cryptocurrency could ever reliably replace fiat.
The answer is never. Pretty much no one is able to both keep a private key secret AND not lose it for a long period of time. It's not a problem when people lose their PGP keys, for example, as they can always create a new one and tell people to use the new one, but with crypto, losing your key means losing your money. Imagine if you lost your bank account because your computer burned in a house fire and you never backed up your key. If cryptocurrencies became the norm, that would happen to way too many people.
Bitcoin is digital cash - this is a quite complete analogy.
Like cash, you can keep it in your possession, so you can lose it in a fire, or else you can put it in a bank.
So you rely on a bank for your transactions, completely negating the only benefits of cryptocurrencies.
It’s negated all gains made by renewable energy
Last time there was research on this, it was 70% driven by renewable energy. More than most industries.
It actually increases investment into renewable energy by making it more profitable to produce renewable energy.
Exactly my point. Wasting renewable resources on cryptomining is a crime against humanity.
You don't get it.
Much of the energy used would otherwise be wasted. Mining takes energy that would just be overproduced and thrown away and converts it into coins, which makes producing renewable energy much more profitable than it would otherwise be and so accelerates the expansion of renewable energy.
Thus it is not a crime; it actually helps motivate building more renewable energy.
Neither you nor I can decide whether it is a waste or not. It's simply our opinion.
The main issue I see with Bitcoin is that so many people are investing in it and not using it for its real purpose (an alternative to actual money). The fact that the Fed prints money while Bitcoin is finite suggests there will be fewer inflation issues. Although I am not an economist.
It's a "currency" where the majority of the currency got handed out to very few people very early on. It has a more concentrated level of wealth than the real economy has. And the energy use is a far bigger issue than all others.
It is way more distributed than wealth in the normal fiat system, in which only a few own 50% of all the wealth.
Unlike that system you are also free not to use it.
You also have the opportunity to get other coins or versions that are cheaper.
Why would it always be better in terms of UX? Nothing is stopping a decentralized system from having good intuitive UIs.
Just the fact that you have many providers is already a UX challenge. Either you have a "default" provider where everyone will flock (for example Gmail for email), making decentralisation pointless, or you force people to choose between many smaller providers, and most won't want to make such a decision before even beginning to use the service and will just leave.
I don't see how that is challenging for users unless they are so lazy that they won't even change any setting on their device regardless of how annoyed they are. If the app has a directory of instances with their bios linked it can make it easier for them.
What do you mean by decentralized? Searx actually works better than Google with its meta search capabilities, and given that you can host it anywhere, it is decentralized.
Are you referring to some sort of decentralized indexing of the internet?
YaCy is a decentralized search engine.
Only if it actually worked.
pros:
- powerful customization. intuitive and fine grained.
- decentralized. huge power but not perfected.
- crawler is simple and customizable. fairly raw.
- solr/lucene benefits included from day 1.
- website that covers the basics. and forum for help.
cons:
- frequently buggy and feels heavy
- documentation. very little documentation. many good features that are difficult to discover or use.
- consumes a lot of storage. not divisible.
- setup takes time
- no foolproof simplicity. and less adoption due to setup time+difficulty.
- not recognized as important and not very popular. which is a self-fulfilling prophecy.
- not too many maintainers/contributors. slow development.
alternatives:
none
there are many alternatives for more efficient small personalized search engines with medium startup time and skill level.
a crawler can be more customizable if you script it yourself. (for yacy or whatever else)
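To give an idea of what "script it yourself" can look like, here is a minimal crawler sketch in plain Python (standard library only). It is an illustration under my own assumptions, not how YaCy crawls: the names `extract_links` and `crawl` are invented, and `fetch` is a hypothetical function you supply that returns a page's HTML or None on failure.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the raw href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute http(s) links found in html, with fragments removed."""
    parser = LinkParser()
    parser.feed(html)
    found = []
    for href in parser.links:
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        if urlparse(absolute).scheme in ("http", "https"):
            found.append(absolute)
    return found


def crawl(start_url, fetch, max_pages=50, same_host_only=True):
    """Breadth-first crawl; returns {url: html} for the pages fetched.

    `fetch(url)` is caller-supplied (hypothetical): HTML string or None.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # skip pages that could not be fetched
        pages[url] = html
        for link in extract_links(url, html):
            if same_host_only and urlparse(link).netloc != host:
                continue
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

The customization happens in the parts you control: what `fetch` does (headers, delays, robots.txt checks), which links pass the filter, and where the pages end up afterwards.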
https://arborist.wip2p.eth.link/#!/goyacy
I’m new to YaCy and not aware of what may or may not be working but I just saw this mentioned as a “frontend for searching YaCy without having to remember which node to use”.
Anyway, I’m gonna give it a try.
What’s not working? Is it a matter of needing more crawlers, as another commenter mentioned, or something else?
Wow, 2/3 times I tried searching through this site I got a dead (or otherwise inaccessible) node.
Yeah, basically the same for me. Sometimes it works ok. Sometimes I get infinite loading.
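One way a frontend could cope with dead nodes is to try several in turn and use the first one that answers. A small Python sketch of that idea, with everything in it being an assumption for illustration: the function name, the node list, and the `search(node, query)` callback, which is expected to enforce its own timeout (so "infinite loading" turns into an error) and to raise an OSError-style exception when a node is unreachable.

```python
def search_first_live_node(nodes, query, search):
    """Try each node in order; return (node, results) from the first that answers.

    `search(node, query)` is a hypothetical caller-supplied function. It should
    apply a timeout and raise OSError (ConnectionError, TimeoutError, ...) when
    the node is dead or too slow.
    """
    failures = {}
    for node in nodes:
        try:
            return node, search(node, query)
        except OSError as exc:
            failures[node] = exc  # remember why this node was skipped
    raise RuntimeError(f"no node answered; tried {len(failures)}")
```

This only hides dead nodes from the user. It does nothing for result quality, and with many dead nodes the wait still adds up to one timeout per failed attempt.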
It’s a good question. I don’t know if any of the great minds of search have tried to do something decentralised, or what kind of pushback it might receive from the search cartel. It seems like IPFS and PeerTube might be fertile ground for researching it further.
Because it's hard and requires A LOT of resources, so it's way easier to ask "why is there still no decentralised search engine?" than to actually implement one.
Yeah, I agree, it's easier said than done. Web crawling requires a lot of resources.