this post was submitted on 07 Sep 2021
27 points (100.0% liked)

Open Source


Hi there, looking for an alternative to news.google.com that simply isn't a Google product. I know it's not open source per se, but just curious.

all 35 comments
[–] [email protected] 9 points 3 years ago* (last edited 3 years ago) (2 children)

What about creating your own newsfeed by adding some RSS feeds to a reader? You would then get updates from your main sources.
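For anyone curious what a reader does under the hood, here is a minimal sketch of pulling the items out of an RSS 2.0 document with only the Python standard library. The sample feed, its URLs, and the function name `parse_rss_items` are made up for illustration; a real reader would also handle Atom, namespaces, and malformed feeds.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed content, used only for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://example.org/</link>
    <description>Hypothetical feed</description>
    <item>
      <title>First story</title>
      <link>https://example.org/first</link>
      <pubDate>Tue, 07 Sep 2021 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <link>https://example.org/second</link>
      <pubDate>Tue, 07 Sep 2021 12:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""


def parse_rss_items(xml_text):
    """Return a list of dicts (title, link, published) for each <item>."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                "published": item.findtext("pubDate", default=""),
            }
        )
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_RSS):
        print(entry["published"], "|", entry["title"], "|", entry["link"])
```

In practice you would fetch the XML from each source's feed URL and merge the items from all sources into one list sorted by date.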

[–] [email protected] 8 points 3 years ago* (last edited 3 years ago)

yeah rss is great, esp if you can sync it with devices, like nextcloud reader or tt-rss

also if a site doesn't have an obvious rss feed, open the source with f12 and search for rss or atom
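The same search can be done in code. This is a rough sketch, assuming the site advertises its feed through a `<link rel="alternate">` tag in the page head, which is the common convention but not guaranteed. The class and function names are hypothetical, and it only uses the standard library.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types commonly used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collects feed URLs advertised via <link rel="alternate" ...> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rel = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").lower()
        href = attr.get("href", "")
        if "alternate" in rel and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html, base_url):
    """Return the list of feed URLs found in an HTML page."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<link rel="alternate" type="application/rss+xml" href="/feed.xml">'
        '<link rel="stylesheet" href="/style.css">'
        '</head><body></body></html>'
    )
    print(find_feeds(page, "https://example.org/blog/"))
```

If this finds nothing, the manual search for "rss" or "atom" in the page source is still worth a try, since some sites only link the feed from the page body.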

[–] [email protected] 4 points 3 years ago

Happy to see RSS coming back. I remember when twitter got popular that people were worried it would replace RSS with a centralized service.

[–] [email protected] 6 points 3 years ago (2 children)

On my phone I use feeder ( android, not sure if it is on ios ). On my computer I use newspaper3k ( https://newspaper.readthedocs.io/en/latest/ ) -- I built out some additional summary tools and nltk tools that allow me to find articles on similar topics from sources with different biases + some named entity extraction that easily joins into dbpedia. I intend to contribute the additional features I've added but haven't done so yet as the code is rough.

[–] [email protected] 3 points 3 years ago (1 children)

I've been thinking about using nlp to deal with my feeds.

Are you happy with your solution ? Can you share a bit more about your pipeline?

[–] [email protected] 4 points 3 years ago* (last edited 3 years ago) (1 children)

I am not happy with it yet, but that is because I want it to be perfect and it never will be. I do find that I engage with content at a larger scale, and with more varied content, than I do when I go to a single source. I am using the nltk features from newspaper for keyword extraction + the trending sources to monitor a few hundred sources. Currently I store all the metadata + links ( urls ) + wikipedia links in a pandas dataframe ( which is becoming a problem ) and visualize trends and data about news in a jupyter notebook. For the enhanced summaries + named entity extraction I am using spacy (https://spacy.io/); from there I use SPARQL ( https://en.wikipedia.org/wiki/SPARQL ) to query dbpedia (https://en.wikipedia.org/wiki/DBpedia) to augment entity knowledge ( ex: adding data about the size and industry of a company, or summary explanations of scientific concepts, etc ). The named entity matching and augmentation is the portion that needs the most work. Newspaper has some nice caching features, so I query all sources every day but only pull in new articles.

I might play around with moving portions of the data into a graph db and some better ways to query based on concepts. Right now I just write python code to query the pandas DB based on different parameters.
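To make the DBpedia step more concrete, here is a minimal sketch of looking up an entity's abstract through the public SPARQL endpoint. This is not the poster's code: the function names are hypothetical, it uses only the standard library, and it assumes the entity name has already been mapped to a DBpedia resource name such as `Mozilla_Foundation` (that matching is the hard part mentioned above).

```python
import json
import urllib.parse
import urllib.request

# Public DBpedia SPARQL endpoint.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_abstract_query(resource_name, lang="en"):
    """Build a SPARQL query for the abstract of one DBpedia resource.

    Assumption: resource_name is already a valid DBpedia resource name
    (underscores instead of spaces, no characters that need escaping).
    """
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT ?abstract WHERE {\n"
        f"  <http://dbpedia.org/resource/{resource_name}> dbo:abstract ?abstract .\n"
        f'  FILTER (lang(?abstract) = "{lang}")\n'
        "} LIMIT 1"
    )


def build_request_url(query):
    """Encode the query as a GET request asking for JSON results."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )
    return f"{DBPEDIA_ENDPOINT}?{params}"


def fetch_abstract(resource_name):
    """Return the abstract text, or None if the resource has none."""
    url = build_request_url(build_abstract_query(resource_name))
    with urllib.request.urlopen(url, timeout=10) as response:
        data = json.load(response)
    bindings = data["results"]["bindings"]
    return bindings[0]["abstract"]["value"] if bindings else None


if __name__ == "__main__":
    # Needs network access; prints the query instead of calling the endpoint.
    print(build_abstract_query("Mozilla_Foundation"))
```

The entity names coming out of spacy would feed `fetch_abstract` after being normalized to resource names, and the returned text could be stored next to the article metadata.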


[–] [email protected] 3 points 3 years ago (1 children)

Wow that's quite developed.

So you consume content in a jupyter notebook? Or are you interfacing this with an RSS reader?

From what I read the next step is to run it in a real database.

[–] [email protected] 2 points 3 years ago (1 children)

I consume analytics and identify topics I am interested in via jupyter; sometimes I just use ipython if I don't want to leave the terminal. I need to build more of a frontend but I've not got there yet. I mostly read the articles in the terminal. And yup, my plan is to find a good db, but I am not sure what to use yet.
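One low-effort option to try before picking a "real" database is SQLite, which ships with Python. The sketch below is only an assumption about what the stored data might look like (URL, source, title, keywords); the table layout and function names are hypothetical. Keying on the URL also gives the "only pull in new articles" behaviour for free.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the article store. Use a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS articles (
               url      TEXT PRIMARY KEY,
               source   TEXT NOT NULL,
               title    TEXT NOT NULL,
               keywords TEXT NOT NULL
           )"""
    )
    return conn


def add_article(conn, url, source, title, keywords):
    """Insert an article; return False if the URL was already stored."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO articles (url, source, title, keywords) "
        "VALUES (?, ?, ?, ?)",
        (url, source, title, ",".join(sorted(keywords))),
    )
    conn.commit()
    return cursor.rowcount == 1


def find_by_keyword(conn, keyword):
    """Return (title, source) pairs for articles tagged with the keyword."""
    rows = conn.execute(
        "SELECT title, source FROM articles "
        "WHERE ',' || keywords || ',' LIKE ? ORDER BY title",
        (f"%,{keyword},%",),
    )
    return rows.fetchall()


if __name__ == "__main__":
    store = open_store()
    add_article(store, "https://example.org/a", "Example A", "Chip shortage", {"chips", "supply"})
    add_article(store, "https://example.net/b", "Example B", "Fab expansion", {"chips"})
    print(find_by_keyword(store, "chips"))
```

Storing keywords as a comma-separated string keeps the sketch short; a separate keyword table would be the cleaner design once the queries get more involved. pandas can still read from the same file for the notebook analytics.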

[–] [email protected] 3 points 3 years ago

You could probably repackage your upgraded feed into an RSS feed that you serve locally. But that may be more hassle than it's worth.
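As a rough idea of how small the repackaging step could be, here is a sketch that turns a list of enriched articles into an RSS 2.0 document using only the standard library. The function name and the item fields (`title`, `link`, `summary`) are assumptions for illustration, not anything from the pipeline described above.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, items):
    """Return an RSS 2.0 document (as a string) for the given items.

    Each item is a dict with "title" and "link", plus an optional "summary".
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        ET.SubElement(node, "description").text = item.get("summary", "")
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    feed = build_rss(
        "My enriched feed",
        "http://localhost:8000/",
        "Articles with added summaries",
        [
            {
                "title": "Chip shortage",
                "link": "https://example.org/a",
                "summary": "Short summary & entity notes",
            }
        ],
    )
    print(feed)
```

Writing that string to a file such as `feed.xml` and running `python -m http.server` in the same directory would be enough for a reader on the same machine to subscribe to it.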

Thanks for the info it encouraged me to try that sometime :)

[–] [email protected] 1 points 3 years ago* (last edited 3 years ago)

Yep Feeder is awesome and it is on iOS

[–] [email protected] 6 points 3 years ago (1 children)

I discovered you could find the very latest news by using, what else, a search engine!

https://search.brave.com/news?q=news https://duckduckgo.com/?q=news&iar=news&ia=news

[–] [email protected] 1 points 3 years ago (3 children)

don't use brave search, it's developed by crypto-fascists. use a hate-combating search engine like google and duckduckgo

[–] [email protected] 1 points 3 years ago* (last edited 3 years ago)

And if you don't want your searches to be tracked by Google, Whoogle is an alternative front-end to access Google Search.

https : / / github (dot) com / kroy94 / whoogle-search

Description: Get Google search results, but without any ads, javascript, AMP links, cookies, or IP address tracking. Easily deployable in one click as a Docker app, and customizable with a single config file. Quick and simple to implement as a primary search engine replacement on both desktop and mobile.

You may notice that some features you would find in Google Search are lacking in Whoogle, but for some people it's probably good enough. Whoogle Search includes Google News (as a tab) below the search bar after you type in a query and press enter.


Searx is a privacy respecting metasearch engine. You can also search for news there.

Github: https : / / github (dot) com / searx / searx

Searx Instances: https : / / searx (dot) space

Searx Instances(onion link): http : / / searxspbitokayvkhzhsnljde7rqmn7rvoga6e4waeub3h7ug3nghoad(dot)onion/

Disclaimer: It is probably not a good idea to self-host Whoogle/Searx, since it is obvious the traffic to Google/some other non-privacy respecting search engine operator is from one person. But depending on your threat model maybe it can work.

[–] [email protected] 1 points 3 years ago

Of course Brave is bad. But if Google isn't crypto-fascist, I don't know what is. They are an integral part of the military industrial complex and are very cozy with governments worldwide. They are also working on the "technopolice" and promoting AI with all of its shortcomings and biases. As for search results specifically, they are based on a secret ranking sauce privileging bad-quality content... I don't know about English-language results, but whatever political topic you look up, neo-fascist conspiracy websites (égalité et réconciliation, fdesouche...) are always in the top 5 results. How could you defend them?

DuckDuckGo is just a meta-engine and stores little information. Whatever appears in Bing (and maybe others) is what will appear in DuckDuckGo.

[–] [email protected] 0 points 3 years ago (1 children)

A privacy forum where you get told to use google because it does a better job with censoring? Good job guys... I'm out... Have fun with your quality news...

[–] [email protected] -1 points 3 years ago (1 children)
[–] [email protected] -1 points 3 years ago (1 children)

Bye bye pro censorship fascist 😯

[–] [email protected] 0 points 3 years ago

nice projection you got there

[–] [email protected] 5 points 3 years ago* (last edited 3 years ago)

I often go on skimfeed.com just because it aggregates a lot of different feeds. I mainly depend on Inoreader (which isn't open-source) to keep an eye on several RSS feeds, but you could host your own RSS feed aggregator like FreshRSS to keep your reading history in sync across devices. I just didn't want to bother with it.

[–] [email protected] 4 points 3 years ago

Movim has a very nice news feed. It lets you subscribe to some of the major news outlets. I've been happily using it for two years now. https://movim.eu/

[–] [email protected] 4 points 3 years ago

For RSS, there are plenty of open-source options to choose from. Flym is a nice option for android. I personally use newsboat pointed to a terminal browser on my laptop, but that's mainly due to my focus on text-based news consumption. I'm sure that there are much more traditional options out there.

[–] [email protected] 4 points 3 years ago

I believe we just have to get used to finding our news by ourselves. Through RSS, bookmarks and good synchronization and integration solutions.

[–] [email protected] 2 points 3 years ago

I use lire on iOS. I’m not sure if it’s on android. Very easy setup for RSS reads in my opinion.

[–] [email protected] 1 points 3 years ago (1 children)
[–] [email protected] 3 points 3 years ago (1 children)

Yes, but it isn't OpenSource

[–] [email protected] 1 points 3 years ago* (last edited 3 years ago)

I'm a news junkie with a journalism degree and there really aren't many options. Even Google had legal trouble with their news, so I doubt an indie crawler would do well.