Application Performance Monitoring in Elastic Stack
Elastic APM Server is a separate component that takes input from APM agents and writes the data to Elasticsearch (see the guide).
Elastic APM Agent for .NET is a library that runs the APM agent in-process, so there is no need to install an agent on the host machine.
You might argue that nobody does "installs" nowadays and everything runs in containers, so we would just run APM Server on Docker:
docker pull docker.elastic.co/apm/apm-server:7.0.1
or take a Helm chart.
I would agree. Running APM in-process versus out-of-process results in a different architecture, with its own pros and cons, which we will talk about in another post.
Apm-agent-dotnet writes data to APM Server, and not into Elasticsearch directly.
Apm-agent-dotnet uses two mechanisms to provide automatic instrumentation:
- ASP.NET Core middleware, in particular ApmMiddleware
- DiagnosticSource, in particular AspNetCoreDiagnosticListener, EfCoreDiagnosticListener, and HttpDiagnosticListener (search https://github.com/elastic/apm-agent-dotnet for these classes)
Both mechanisms do the same two things:
- Build the data model defined by Elastic APM, in particular Transactions, Spans, Errors and Metrics, as described here: https://www.elastic.co/guide/en/apm/get-started/7.0/apm-data-model.html
- Send this data to APM Server using async bulk upload
Having that specific data allows the APM section in Kibana to show a number of default visualizations, including a distributed tracing chart.
Now that looks awesome, but we do have to keep in mind that it's just a middleware plus diagnostic listeners for ASP.NET Core, EF Core and outgoing HTTP calls, so it does not cover some frequently used real-life scenarios:
- You are not using HTTP for communication between services (any brokered messaging product like RabbitMQ, MassTransit or Azure Service Bus that talks AMQP, or WCF, really anything except HTTP)
- You are running resource-intensive operations inside one business transaction that consists of several distinct steps, and you would like visibility into each step separately
- You're not using EF Core, but perhaps Dapper or NHibernate
- And many more...
So, there are plenty of reasons to be interested in providing custom metrics and data.
Is that easy to do?
Is it even possible without wrapping everything in Transactions, using blocks, and so on?
Can we "simply" hook up using DiagnosticSource and Activity classes, a standard mechanism in .NET?
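For reference, here is roughly what instrumenting one step with that standard mechanism looks like. This is a minimal sketch that uses only System.Diagnostics; the listener name MyCompany.OrderProcessing, the OrderProcessing and EventNameRecorder classes, and the step name are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class OrderProcessing
{
    // Hypothetical listener name; subscribers look the source up by this name.
    public static readonly DiagnosticListener Source =
        new DiagnosticListener("MyCompany.OrderProcessing");

    // Runs one step of a business operation and reports it as an Activity.
    public static void RunStep(string stepName, Action step)
    {
        // With no interested subscriber, just run the step and skip the Activity.
        if (!Source.IsEnabled(stepName))
        {
            step();
            return;
        }

        var activity = new Activity(stepName);
        activity.AddTag("step", stepName);

        // Starts the activity and writes "<stepName>.Start";
        // the anonymous object is the event payload.
        Source.StartActivity(activity, new { Step = stepName });
        try
        {
            step();
        }
        finally
        {
            // Records the end time and writes "<stepName>.Stop".
            Source.StopActivity(activity, new { Step = stepName });
        }
    }
}

// Minimal observer that records the names of the events it receives.
public sealed class EventNameRecorder : IObserver<KeyValuePair<string, object>>
{
    public List<string> Names { get; } = new List<string>();

    public void OnNext(KeyValuePair<string, object> evt) => Names.Add(evt.Key);
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

public static class Program
{
    public static void Main()
    {
        var recorder = new EventNameRecorder();
        using (OrderProcessing.Source.Subscribe(recorder))
        {
            OrderProcessing.RunStep("ValidateOrder", () => { /* real work goes here */ });
        }

        Console.WriteLine(string.Join(", ", recorder.Names));
    }
}
```

The instrumented code knows nothing about Elastic, or about any other backend. Whether anything ends up in APM depends entirely on who subscribes.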
I've raised a question about that with the Elastic team, and a GitHub issue was created.
So, it might be added eventually, or maybe not, but what if you need it right now?
We can take inspiration from what apm-agent-dotnet does internally. Let's look at HttpDiagnosticListenerImplBase, specifically at OnNext and ProcessStartEvent.
We can use a similar approach and call the public method StartSpan, which will automatically persist the data in Elastic APM Server.
That will do the job if we are ready to instrument our code with DiagnosticSource (firm yes!) and also to map DiagnosticSource and Activity manually to Transactions and Spans (vague, eh...).
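To make the manual mapping a bit less vague, here is a sketch of its general shape: an observer that opens a span on "<name>.Start" and closes it on "<name>.Stop". RecordedSpan and ActivityToSpanMapper are hypothetical stand-ins, not Elastic types. A real mapper would call the agent's public StartSpan where the sketch creates a RecordedSpan, and would end that span where the sketch enqueues it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for an APM span. This is NOT an Elastic.Apm type.
public sealed class RecordedSpan
{
    public string Name { get; set; }
    public TimeSpan Duration { get; set; }
    public Dictionary<string, string> Tags { get; } = new Dictionary<string, string>();
}

// Maps Activity start/stop events to span open/close.
public sealed class ActivityToSpanMapper : IObserver<KeyValuePair<string, object>>
{
    // Spans that were opened but not closed yet, keyed by Activity.Id.
    private readonly ConcurrentDictionary<string, RecordedSpan> _open =
        new ConcurrentDictionary<string, RecordedSpan>();

    public ConcurrentQueue<RecordedSpan> Finished { get; } = new ConcurrentQueue<RecordedSpan>();

    public void OnNext(KeyValuePair<string, object> evt)
    {
        // StartActivity makes the activity current before it publishes the event,
        // and StopActivity publishes before it resets Activity.Current.
        var activity = Activity.Current;
        if (activity == null || activity.Id == null) return;

        if (evt.Key.EndsWith(".Start", StringComparison.Ordinal))
        {
            // A real mapper would start an agent span here.
            _open[activity.Id] = new RecordedSpan { Name = activity.OperationName };
        }
        else if (evt.Key.EndsWith(".Stop", StringComparison.Ordinal)
                 && _open.TryRemove(activity.Id, out var span))
        {
            foreach (var tag in activity.Tags) span.Tags[tag.Key] = tag.Value;
            span.Duration = activity.Duration;
            // A real mapper would end the agent span here.
            Finished.Enqueue(span);
        }
    }

    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

public static class Program
{
    public static void Main()
    {
        var mapper = new ActivityToSpanMapper();
        using (var listener = new DiagnosticListener("MyCompany.Payments")) // hypothetical name
        using (listener.Subscribe(mapper))
        {
            var activity = new Activity("ChargeCard").AddTag("gateway", "sandbox");
            listener.StartActivity(activity, null);
            listener.StopActivity(activity, null);
        }

        foreach (var span in mapper.Finished)
            Console.WriteLine($"{span.Name} took {span.Duration.TotalMilliseconds} ms");
    }
}
```

The open spans are keyed by Activity.Id so that nested or concurrent activities each close their own span.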
Hard truth is: no one will do the instrumentation for us.
But, maybe someone can do the mapping?
Elastic APM supports the OpenTracing bridge (aka the OpenTracing project), so what can we learn from their GitHub and from the OpenTracing API Contributions GitHub?
The docs are somewhat brief, but we can dig into the code, starting with
services.AddOpenTracing();
Looking into it, we can see that it calls .AddCoreFx(), which in turn calls builder.AddDiagnosticSubscriber<GenericDiagnostics>();
The comments for GenericDiagnostics say
/// <summary>
/// A <see cref="DiagnosticListener"/> subscriber that logs ALL events to <see cref="ITracer.ActiveSpan"/>.
/// </summary>
That's exactly what we need, so let's look at how it works under the hood.
First off, GenericDiagnostics uses a GenericDiagnosticsSubscription, so let's look at how it implements IObserver:
Obviously, GenericEventProcessor is the next place we look:
So, the Activity class is used here, which is a good thing, but notice that only Activity.Tags are carried over, and Activity.Baggage is ignored. The next piece of sad news is that object untypedArg, as you might have noticed, is not used at all, and this is the context that was passed to diagnosticSource.StartActivity(activity, context).
(Internally, DiagnosticSource.StartActivity calls Write, which is implemented in the derived class as DiagnosticListener.Write, and Write calls OnNext on all subscriptions. This is the standard "publishing" mechanism.)
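To see what a subscriber actually has at hand when OnNext fires, here is a small self-contained sketch of that publishing mechanism. It is not the OpenTracing contrib code; OrderContext, FullContextObserver and the listener name are hypothetical. The observer reads the event payload, the tags and the baggage.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical context object passed as the second argument of StartActivity.
public sealed class OrderContext
{
    public int OrderId { get; set; }
}

// Observer that captures the payload, the tags and the baggage of every event.
public sealed class FullContextObserver : IObserver<KeyValuePair<string, object>>
{
    public List<string> Lines { get; } = new List<string>();

    public void OnNext(KeyValuePair<string, object> evt)
    {
        // evt.Value is the object that was handed to StartActivity / StopActivity.
        var orderId = evt.Value is OrderContext ctx ? ctx.OrderId.ToString() : "n/a";

        // The activity being started or stopped is available as Activity.Current.
        var activity = Activity.Current;
        var tags = activity == null
            ? ""
            : string.Join(";", activity.Tags.Select(t => t.Key + "=" + t.Value));
        var baggage = activity == null
            ? ""
            : string.Join(";", activity.Baggage.Select(b => b.Key + "=" + b.Value));

        Lines.Add($"{evt.Key} payload.OrderId={orderId} tags=[{tags}] baggage=[{baggage}]");
    }

    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

public static class Program
{
    public static void Main()
    {
        var observer = new FullContextObserver();
        using (var listener = new DiagnosticListener("MyCompany.Orders")) // hypothetical name
        using (listener.Subscribe(observer))
        {
            var activity = new Activity("PlaceOrder")
                .AddTag("channel", "web")
                .AddBaggage("tenant", "acme");
            var context = new OrderContext { OrderId = 42 };

            listener.StartActivity(activity, context);
            listener.StopActivity(activity, context);
        }

        foreach (var line in observer.Lines) Console.WriteLine(line);
    }
}
```

All three pieces of information reach the subscriber, so one that keeps only the tags is discarding the other two.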
That does not look good.
Will it get fixed? Unlikely. At the time of writing, the last commits to that repo were back in 2018.
But what if we were to accept these issues (and maybe contribute to the project on GitHub later) and use the package anyway?
How can we get this info into the Elastic Stack?
It looks like we would have to harness the Events API of the Elastic Stack.
We could try to utilize NEST and Elasticsearch.Net, but do they cover Events API?
That seems like a long shot. I'd say it's not worth it.
Well then, how about Azure Diagnostics EventFlow?
This one has an Elasticsearch sink and a DiagnosticSource listener.
I could ramble on about what's inside, but I'll cut straight to the chase: the Elasticsearch sink writes data to Elasticsearch, not to Elastic APM, so you won't see your data "out of the box" in the Kibana APM section.
So, how do we go about this? Is there really no solution out there for getting telemetry from already instrumented code to some SaaS?
Stay tuned, next time we will look into Azure Application Insights.