Stream request logs for static.crates.io from Fastly to Datadog #406

Merged
jdno merged 9 commits into rust-lang:master from fastly-datadog-log-tagging
Apr 23, 2024

Member

@jdno jdno commented Apr 16, 2024

crates.io recently implemented a few changes that make cargo download crates directly from static.crates.io without hitting the API of crates.io first. This increases the performance and scalability of crates.io, but means that requests are no longer logged by the application. We want to restore the previous behavior by aggregating request logs from our CDNs in Datadog.

The Fastly service has been extended to include more information in its request logs and to stream them to Datadog. We're also tagging the logs with Datadog's Unified Service Tagging to provide visibility across the app (crates.io), service (static.crates.io), and different environments.

jdno added 9 commits March 7, 2024 20:10
Datadog has a few reserved attributes[^1] that have a special meaning on
its platform. The logs that the Fastly service generates now set two of
them, namely the source and the service. Both will make it easier to
process the logs in Datadog's log pipeline.

[^1]: https://docs.datadoghq.com/logs/log_configuration/attributes_naming_convention/#reserved-attributes
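
As a rough illustration of the commit above, the sketch below serializes a trimmed-down log line that carries the two attributes. It is not the code from this PR: the `LogLine` struct, the `escape` helper, the `ddsource` key (the name Datadog's HTTP intake accepts for the source), and the `fastly` value are all assumptions made for the example.

```rust
/// Hypothetical, trimmed-down log line; the real log has more fields.
struct LogLine<'a> {
    /// Assumed key for Datadog's reserved `source` attribute.
    ddsource: &'a str,
    /// Datadog's reserved `service` attribute.
    service: &'a str,
    /// The requested URL.
    url: &'a str,
}

/// Escape backslashes and double quotes so the value fits in a JSON string.
/// Control characters are not handled in this sketch.
fn escape(value: &str) -> String {
    value.replace('\\', "\\\\").replace('"', "\\\"")
}

impl LogLine<'_> {
    /// Render the log line as a single-line JSON object.
    fn to_json(&self) -> String {
        format!(
            "{{\"ddsource\":\"{}\",\"service\":\"{}\",\"url\":\"{}\"}}",
            escape(self.ddsource),
            escape(self.service),
            escape(self.url)
        )
    }
}
```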
The logging implementation for the Compute@Edge service on Fastly has
been refactored so that logs can be sent to both S3 for long-term
storage and Datadog for real-time analysis.

The environment is now passed to the Terraform module as a variable.
This also makes it possible to dynamically derive the SSM parameter for
the customer ID, which was previously hard-coded.

Datadog has a concept called unified service tagging[^1] that connects
data across different parts of the platform. We have added more tags to
each log to make use of this feature.

[^1]: https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging
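
One way to picture the tagging is a comma-separated list of `key:value` pairs attached to every log. The sketch below assumes that the tags are sent in this form and that `env`, `app`, and `service` are the keys; the `dd_tags` function is hypothetical and not taken from this PR.

```rust
/// Build a comma-separated `key:value` tag string from the three values.
/// The key names are assumptions for illustration.
fn dd_tags(env: &str, app: &str, service: &str) -> String {
    [("env", env), ("app", app), ("service", service)]
        .iter()
        .map(|(key, value)| format!("{key}:{value}"))
        .collect::<Vec<_>>()
        .join(",")
}
```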
The prior version of the log was very barebones, which was fine for
capturing the requested URLs, but insufficient to debug the service. The
extended log format includes more information about the client,
protocols, and Fastly service.

In an effort to simplify the configuration of both the Terraform and
Rust modules, some hard-coded constants have been moved into the
Compute@Edge module. This makes it possible to remove the glue code to
pass them from the Terraform configuration to the final WASM function.
While this does introduce some duplication, it removes a lot of
complexity and potential configuration issues.
Comment on lines +17 to +18
const DATADOG_APP: &str = "crates.io";
const DATADOG_SERVICE: &str = "static.crates.io";
Member Author

Originally, these two static constants were defined in Terragrunt. But they don't change depending on the environment, and hardcoding them removes a potential panic if they were accidentally deleted.
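
A minimal sketch of the trade-off described above, assuming a `HashMap` as a stand-in for whatever runtime configuration the values were previously read from. The `datadog-service` key and the `service_from_config` helper are hypothetical; only the two constants appear in the diff.

```rust
use std::collections::HashMap;

// The two constants from the diff: always present, checked at compile time.
const DATADOG_APP: &str = "crates.io";
const DATADOG_SERVICE: &str = "static.crates.io";

/// Hypothetical runtime lookup. If the entry is missing, the caller gets
/// `None` and has to decide between a fallback and a panic.
fn service_from_config(config: &HashMap<String, String>) -> Option<&str> {
    config.get("datadog-service").map(String::as_str)
}
```

With the lookup, a deleted entry only shows up when a request is handled. With the constants, that failure mode no longer exists.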

.referer(
request
.get_header("Referer")
.and_then(|s| s.to_str().ok())
Member Author

.ok() is used here to prevent issues with the logs causing a request to fail.
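
To illustrate the pattern without pulling in the Fastly SDK, the sketch below uses raw bytes and `std::str::from_utf8` as a stand-in for a header value and its fallible string conversion. The `header_as_str` and `referer_field` helpers are hypothetical and not part of this PR.

```rust
/// Stand-in for a fallible header-to-string conversion: the raw bytes
/// may not be valid text, in which case the error is discarded.
fn header_as_str(raw: &[u8]) -> Option<&str> {
    std::str::from_utf8(raw).ok()
}

/// Produce the optional `referer` log field. A missing or unreadable
/// header yields `None` instead of an error that could fail the request.
fn referer_field(header: Option<&[u8]>) -> Option<String> {
    header.and_then(header_as_str).map(str::to_owned)
}
```

A malformed header therefore costs one empty log field rather than a failed download.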

@jdno
Member Author

jdno commented Apr 16, 2024

@Turbo87 Does the log format still comply with the parsing that crates.io does?

.date_time(OffsetDateTime::now_utc())
.url(request.get_url_str().into())
.edge_location(var("FASTLY_POP").ok())
.host(request.get_url().host().map(|s| s.to_string()))
Member

is there a reason for including host if it's essentially derived from url?

Member Author

I did a really quick experiment with the Remapper processor on Datadog, but that doesn't seem to like using a nested attribute (http.url_details.host) as the source. So I'd say we just leave it in for now for simplicity.

@Turbo87
Member

Turbo87 commented Apr 16, 2024

> Does the log format still comply with the parsing that crates.io does?

AFAICT there were no changes to the existing fields other than reordering them, so at first glance this looks fine.

@Mark-Simulacrum
Member

I guess this is a direct stream, right? So this doesn't address #405?

@jdno
Member Author

jdno commented Apr 18, 2024

The same log format is used for the stream and the files that we write to S3. So after this change, those will include the User-Agent header as well.

@jdno
Member Author

jdno commented Apr 18, 2024

Does it make sense to bump the version of the log format to version: 2 with these changes?

@Turbo87
Member

Turbo87 commented Apr 18, 2024

> Does it make sense to bump the version of the log format to version: 2 with these changes?

that would require code updates and us deploying those updates to crates.io before the log format can be updated. since the changes are additive and non-breaking I don't think we need a new version.

@jdno jdno merged commit a372426 into rust-lang:master Apr 23, 2024
3 checks passed
@jdno jdno deleted the fastly-datadog-log-tagging branch April 23, 2024 14:15
3 participants