Telegram crawl depth does not go higher than 1 even if something else is entered #442

Open
stijn-uva opened this issue Jul 22, 2024 · 2 comments
Labels
bug Something isn't working data source Data source-related issues

Comments

@stijn-uva
Member

What it says on the tin

@stijn-uva stijn-uva added bug Something isn't working data source Data source-related issues labels Jul 22, 2024
@dale-wahl
Member

Two things I noticed:

First, `if crawl_max_depth and (not crawl_msg_threshold or depth_map.get(query) < crawl_msg_threshold):` looks off. I think something like `if crawl_max_depth and (depth_map.get(query) < crawl_max_depth):` is sufficient at this point, and `crawl_msg_threshold` only comes into play later, on individual `fwd_from` channels/groups/etc. Likely just a misnamed variable that happened to work under the right circumstances.

Second, it looks like the structure of `serialized_message.get("fwd_from")` may have changed (at least, the examples I am finding would never trigger the code as is).
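
To illustrate the depth check, here is a minimal sketch that compares the quoted condition with the suggested one. The function names and the sample `depth_map` are hypothetical and only wrap the two expressions for comparison; this is not the actual data source code, and it assumes every queried entity already has an entry in `depth_map`.

```python
def should_crawl_original(query, depth_map, crawl_max_depth, crawl_msg_threshold):
    """Condition as quoted in the issue: depth is compared to the message threshold."""
    return bool(
        crawl_max_depth
        and (not crawl_msg_threshold or depth_map.get(query) < crawl_msg_threshold)
    )


def should_crawl_proposed(query, depth_map, crawl_max_depth):
    """Suggested condition: depth is compared to the maximum crawl depth."""
    return bool(crawl_max_depth and depth_map.get(query) < crawl_max_depth)


if __name__ == "__main__":
    # Hypothetical entities mapped to the depth at which they were discovered
    depth_map = {"seed": 0, "hop_1": 1, "hop_2": 2, "hop_3": 3}

    for name in depth_map:
        original = should_crawl_original(name, depth_map, crawl_max_depth=3, crawl_msg_threshold=1)
        proposed = should_crawl_proposed(name, depth_map, crawl_max_depth=3)
        print(f"{name}: original={original} proposed={proposed}")
```

With a maximum depth of 3 and a message threshold of 1, the quoted condition only lets the depth-0 entity through, which would match the symptom in the title; the suggested condition allows depths 0 through 2.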

I will send a PR though and am testing now. I can certainly get it crawling, but you may want to take a look to come up with what you want to crawl. I seem to have unleashed something...

@dale-wahl
Member

#444
