We use the Consul Service Discovery mechanism to fetch a list of proxies through which we scrape certain targets. There are multiple proxies for redundancy but ultimately they all provide the exact same information.
Now we'd like to have the relabeling always drop all but one (random) node returned from SD. It must not be hardcoded, as the names and number of proxies can and will change.
After looking at the relabeling implementation I don't think this is possible, but maybe there is some clever hack to achieve this.
Question: Is it possible to drop all but one (random) node from Prometheus Service Discovery?
This is not possible. I'd suggest putting a load balancer of some form in front of the proxies.
Related
I want a graph where all recent IPs that requested my webserver are shown with their total request counts. Is something like this doable? Can I add a query and remove it afterwards via Prometheus?
Technically, yes. You will need to (a minimal code sketch follows this list):
Expose some metric (probably a counter) in your server - say, requests_count, with a label - say, ip
Whenever you receive a request, increment the metric with the label set to the requester's IP
In Grafana, graph the metric, likely summing it by the IP address to handle the case where you have several horizontally scaled servers handling requests: sum(your_prometheus_namespace_requests_count) by (ip)
Set the Legend of the graph in Grafana to {{ ip }} to 'name' each line after the IP address it represents
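Here is a minimal, hedged sketch of the server side using the official prometheus_client package (Python; the Flask app and route are assumptions for illustration - only the requests_count metric and the ip label come from the steps above):

from flask import Flask, request
from prometheus_client import Counter, make_wsgi_app
from werkzeug.middleware.dispatcher import DispatcherMiddleware

app = Flask(__name__)

# Counter with an 'ip' label, as described in the steps above.
requests_count = Counter('requests_count', 'Requests received, by client IP', ['ip'])

@app.route('/')
def index():
    # Increment the metric with the label set to the requester's IP.
    requests_count.labels(ip=request.remote_addr).inc()
    return 'ok'

# Expose /metrics for Prometheus to scrape alongside the app.
app.wsgi_app = DispatcherMiddleware(app.wsgi_app, {'/metrics': make_wsgi_app()})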
However, every different label value a metric has causes a whole new time series to exist in the Prometheus time-series database; you can think of a metric like requests_count{ip="192.168.0.1"}=1 as somewhat similar to requests_count_ip_192_168_0_1{}=1 in terms of how it consumes memory. Each series currently held in the Prometheus TSDB head takes something on the order of 3kB to exist. That means that if you're handling requests from millions of distinct IPs, you're going to swamp Prometheus' memory with gigabytes of data from this one metric alone (a million series at roughly 3kB each is already around 3GB). A more detailed explanation of this issue exists in this other answer: https://stackoverflow.com/a/69167162/511258
With that in mind, this approach would make sense if you know for a fact to expect a small set of IP addresses (maybe on an internal intranet, or a client application you distribute to a small number of known users), but if you are planning to deploy to the web, this is a very easy way for people to (most likely unknowingly) crash your monitoring systems.
You may want to investigate an alternative -- for example, Grafana is capable of ingesting data from some common log aggregation platforms, so perhaps you can do some structured (e.g. JSON) logging, hold that in e.g. Elasticsearch, and then create a graph from the data held within that.
Let's say I have an HTTP/2 service that keeps a list of users and each user's hair color, both in memory and in a database.
Now I want to scale this up to multiple nodes; however, I do not want the same user to be in two different servers' memory - each server should handle a specific subset of the users. This means I need to inform the load balancer where each user is being handled. In case of scale-down, I need to signal that the user is no longer pinned anywhere and can be routed to any server, or by a given rule - e.g. the server with the least memory in use.
Does anyone know if the ALB load balancer supports this? One path I was thinking of was query-string-parameter-based routing, where the request itself carries something like destination_node = (int)user_id % 4 in case I had 4 nodes, for instance (a sketch follows this list). This worked well in a proof of concept, but it leads to a few issues:
The service itself would need to know how many instances there are to balance.
I could not guarantee even balancing; it's basically luck-based balancing.
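For reference, a minimal sketch of that modulo routing idea (Python; NUM_NODES and the query-string name are assumptions for illustration):

NUM_NODES = 4  # the service must know the instance count (issue 1)

def destination_node(user_id: int) -> int:
    # Pins each user to a fixed node; the distribution is only as even
    # as the user_id values themselves (issue 2).
    return int(user_id) % NUM_NODES

# The result could travel as a query-string parameter the ALB routes on,
# e.g. GET /users/387?destination_node=3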
What would be the preferred approach to this, or what is a common way of solving this problem? Does AWS ELB support this out of the box? I was trying to avoid writing my own balancer: a middleware that keeps track of which instances are handling which users, and whose responsibility would be distributing the requests among those servers.
With the AWS Application Load Balancer (ALB) it is possible to write routing rules based on:
Host Header
HTTP Header
HTTP Request Method
Path Pattern
Query String
Source IP
But at the moment there is no way to route on dynamic conditions.
If it is possible to group your data, I would prefer a path pattern like
/users/blond/123
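As a hedged illustration (not a complete setup), such a path-pattern rule could be created with boto3; the ARNs below are placeholders and the pattern is just the example above:

import boto3

elbv2 = boto3.client('elbv2')

# Forward /users/blond/* to a dedicated target group; ARNs are placeholders.
elbv2.create_rule(
    ListenerArn='arn:aws:elasticloadbalancing:region:account:listener/app/my-alb/abc/def',
    Priority=10,
    Conditions=[{'Field': 'path-pattern', 'Values': ['/users/blond/*']}],
    Actions=[{'Type': 'forward', 'TargetGroupArn': 'arn:aws:elasticloadbalancing:region:account:targetgroup/blond/xyz'}],
)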
I have an OpenWhisk setup on Kubernetes using [1]. For study purposes, I want to have a fixed number of replicas/pods for each action that I deploy, essentially disabling the auto-scaling feature.
Similar facility exists for OpenFaas [2], where during deployment of a function, we can configure the system to have N function replicas at all times. These N function replicas (or pods) for the given function will always be present.
I assume this can be configured somewhere while deploying an action, but being a beginner in OpenWhisk, I could not find a way to do this. Is there a specific configuration that I need to change?
What can I do to achieve this in OpenWhisk? Thanks :)
https://github.com/apache/openwhisk-deploy-kube
https://docs.openfaas.com/architecture/autoscaling/#minmax-replicas
OpenWhisk serverless functions follow a model closer to AWS Lambda's. You don't set the number of replicas. OpenWhisk uses various heuristics and can specialize a container in milliseconds, so elasticity on demand is more practical than in kube-based solutions. There is no mechanism in the system today to set minimums or maximums. A function gets to scale proportionally to the resources available in the system, and when that capacity is maxed out, requests will queue.
Note that while AWS allows one to set the max concurrency, this isn’t the same as what you’re asking for, which is a fixed number of pre-provisioned resources.
Update to answer your two questions specifically:
Is there a specific configuration that I need to change?
There isn’t. This feature isn’t available at user level or deployment time.
What can I do to achieve this in Openwhisk?
You can modify the implementation in several ways to achieve what you’re after. For example, one model is to extend the stem-cell pool for specific users or functions. If you were interested in doing something like this, the project Apache dev list is a great place to discuss this idea.
I want to plan a solution that manages enriched data in my architecture.
To be more clear, I have dozens of microservices,
let's say: Country, Building, Floor, Worker,
all running over separate NoSQL data stores.
When I get the data from the Worker service, I also want to present the floor name (the floor the worker is working on), the building name, and the country name.
Solution 1.
The client will query all the microservices.
Problem - multiple requests, and the client must be aware of the structure.
I know multiple requests shouldn't bother me, but I believe that returning a JSON describing the entity in one single call is better.
Solution 2.
Create an orchestration that retrieves the data from multiple services.
Problem - if the data (entity names, for example) is not stored in the same document in the DB, it is very hard to sort and filter by these fields.
Solution 3.
Before saving the entity, e.g. a worker, call all the other services and fill in the related data (building name, country name).
Problem - when the building name is changed, the change is not reflected in the Worker service.
Solution 4.
(This is the best one I can come up with.)
Create a process that subscribes to a broker and receives all entity changes.
For each change it updates all the relevant entities.
When an entity changes - let's say a building name changes - it updates all the documents that hold the building name. (A sketch of such a subscriber follows below.)
Problems:
Each service has to know what can be updated.
When a trailing update happens, it shouldn't publish to the broker again (recursive updates), which adds complication to the microservices.
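A minimal, self-contained sketch of that Solution 4 subscriber (Python). The in-memory list stands in for the Worker service's document store and the event shape is an assumption; in production the handler would be subscribed to your broker:

workers = [
    {'id': 1, 'building_id': 87, 'building_name': 'HQ'},
    {'id': 2, 'building_id': 87, 'building_name': 'HQ'},
]

def on_entity_changed(event: dict) -> None:
    # Fan a building rename out to every worker document that embeds it.
    if event['entity'] == 'building' and 'name' in event['changes']:
        for worker in workers:
            if worker['building_id'] == event['id']:
                worker['building_name'] = event['changes']['name']
    # Deliberately do not re-publish a change event here,
    # otherwise updates recurse through the broker.

on_entity_changed({'entity': 'building', 'id': 87, 'changes': {'name': 'New HQ'}})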
Solution 5.
Keep everything normalized; filter and sort in Elasticsearch.
Problem: keeping normalized data in ES is too expensive performance-wise.
One thing I saw Netflix do (which I like) is create intermediary services for stuff like this. So maybe a new intermediary service that can call the other services to gather all the data, then create the unified output with the Country, Building, Floor, and Worker.
You can even go one step further and try to come up with a scheme for specifying as input which resources you want included in the output.
So I guess this closely matches your Solution 2. I notice that you mention concerns for Solution 2 with sorting/filtering in the DBs. I think that if you are using NoSQL then it has to be for a reason, and more often than not the reason is performance. I think if this was done wrong then yes, you will have problems, but if all the appropriate searchable fields are properly keyed and indexed (as @Roman Susi mentioned in his bullet points 1 and 2) then I don't see this being a problem. This service will only be as fast as the combination of your other services and data stores, so they have to be fast.
Now you keep your individual microservices as they are, keep the client calling one service, and encapsulate the complexity of merging the data into this new service.
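A hedged sketch of what such an intermediary service might look like (Python, using requests; the service URLs and field names are assumptions for illustration):

import requests

def get_enriched_worker(worker_id: int) -> dict:
    # Gather the pieces from the individual microservices...
    worker = requests.get(f'http://workers.internal/workers/{worker_id}').json()
    floor = requests.get(f'http://floors.internal/floors/{worker["floor_id"]}').json()
    building = requests.get(f'http://buildings.internal/buildings/{floor["building_id"]}').json()
    country = requests.get(f'http://countries.internal/countries/{building["country_id"]}').json()
    # ...and merge them into the unified output for the client.
    return {
        'worker': worker['name'],
        'floor': floor['name'],
        'building': building['name'],
        'country': country['name'],
    }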
This is the video where I saw this (https://www.youtube.com/watch?v=StCrm572aEs); it's a long video but very informative.
It is hard to advise at the Solution N level, but certain problems can be avoided by following these recommendations:
Use globally unique identifiers for entities, for example by using some kind of URI as key values.
The global ids also simplify updates, because you can track what has actually changed: the name or the entity itself (an entity has a one-to-one relation with its global URI).
The CAP theorem says you can choose only two of consistency, availability, and partition tolerance. Do you want a CA architecture? Or CP? Or maybe AP? This will strongly affect the way you distribute data.
For "sort and filter" there is the MapReduce approach, which can distribute the load of computing those things.
Think carefully about the balance of normalization/denormalization. If your services operate on URIs, you can have a service which turns URIs into labels (names, descriptions, etc.), but you do not need to keep the redundant information everywhere and update it. Do not do premature optimization; try to keep data normalized as long as possible. This way, a worker may not even need the building name, only its global id, and the microservice looks up the metadata from another microservice. (A tiny sketch of such a lookup follows this list.)
In other words, minimize the number of keys shared between services, as part of separation of concerns.
Focus on the underlying model, not the JSON going back and forth. Correct modelling of the data in your system(s) gains you more than saving JSON calls.
As for NoSQL, take a look at the Riak database: it has adjustable CAP properties, IIRC. Even if you do not use it as such, reading its documentation may help you come up with a suitable architecture for your distributed microservices system. (Of course, this applies if you have an essentially parallel system.)
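A tiny, hedged sketch of the URI-to-label idea (Python; the in-memory mapping is an assumption - in practice this would be its own microservice backed by a store):

LABELS = {
    '/buildings/87': 'Headquarters',
    '/floors/123': 'Floor 12',
}

def label_for(uri: str) -> str:
    # Resolve a globally unique id to its human-readable label;
    # fall back to the URI itself when no label is known.
    return LABELS.get(uri, uri)

print(label_for('/buildings/87'))  # 'Headquarters'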
First of all, thanks for your question. It is similar to the main problem of document DBs: how do you sort a collection by a field from another collection? I have my own answer for that, so I'll try to comment on all your solutions:
Solution 1: It is good if the client wants to work with Countries/Buildings/Floors independently. But it does not solve the problem you mentioned in Solution 2 - sorting 10k workers by building is going to be slow.
Solution 2: Similar to Solution 1, if all the client wants is a list of enriched workers without knowing how to combine it from multiple pieces.
Solution 3: As you said, unacceptable because of inconsistent data.
Solution 4: It is going to work, most of the time. But:
Huge data duplication. If you have 20 entities, you are going to have 20x the data.
Large complexity. 20 entities -> 20 different procedures to update related data.
High coupling. All your services must know each other. A data model change will propagate to every service because of the update procedures.
Questionable eventual consistency. It can be done so that data is consistent after failures, but it is not going to be easy.
Solution 5: This is kind of the answer :-)
But you do not want everything in there. Keep separate services that serve separate entities, and build other services on top of them.
If the client wants enriched data, build a service that returns enriched data, as in Solution 2.
If the client wants to display a list of enriched data with filtering and sorting, build a service that provides enriched data with filtering and sorting capability! Likely, the implementation of such a service will contain an ES instance holding cached and indexed data from the lower-level services (a sketch follows below). The point here is that ES does not have to contain everything or be shared between every service - it is up to you to decide on the right balance between performance and infrastructure resources.
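For illustration, a hedged sketch of what that ES-backed query might look like (Python, elasticsearch 8.x client; the index name and fields are assumptions):

from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')

# Filter and sort enriched workers in the denormalized 'workers' index,
# which caches data indexed from the lower-level services.
resp = es.search(
    index='workers',
    query={'match': {'building_name': 'HQ'}},
    sort=[{'building_name.keyword': 'asc'}],
)
for hit in resp['hits']['hits']:
    print(hit['_source'])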
This is a case where Linked Data can help you.
Basically the floor attribute of the worker would be a URI (a link) to the floor itself, and any other linked data should be expressed as URIs as well.
Modeled with some JSON-LD it would look like this:
worker = {
  '@id': '/workers/87373',
  name: 'John',
  floor: { '@id': '/floors/123' }
}

floor = {
  '@id': '/floors/123',
  level: 12,
  building: { '@id': '/buildings/87' }
}

building = {
  '@id': '/buildings/87',
  name: "John's home",
  city: { '@id': '/cities/908' }
}
This way, all the client has to do is append the base URL (like api.example.com) to the @id and make a simple GET call.
To remove the extra-calls burden from the client (in case it's a slow mobile device), we use the gateway pattern with microservices. The gateway can expand those links with very little effort and augment the returned object. It can also make multiple calls in parallel.
So the gateway will make a GET /floors/123 call and replace the floor link on the worker with the reply.
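A minimal sketch of that link expansion in the gateway (Python, using requests; BASE_URL and the response shapes are assumptions):

import requests

BASE_URL = 'https://api.example.com'

def expand(resource: dict) -> dict:
    # Replace every {'@id': ...} link with the resource it points to.
    for key, value in resource.items():
        if isinstance(value, dict) and set(value) == {'@id'}:
            resource[key] = requests.get(BASE_URL + value['@id']).json()
    return resource

# e.g. expand(worker) fetches /floors/123 and embeds the floor object.
# The per-link GETs could also be issued in parallel.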
Sorry, I'm a beginner in load balancing.
In distributed environments we tend more and more to send the computation (map/reduce) to the data, so that the result gets computed locally and then aggregated.
What I'd like to do applies to partitioned/distributed data, not replicated data.
Following the same kind of principle, I'd like to be able to send a user's request to the server where that user's data is cached.
When using an embedded cache or data grid to get low response times, and the dataset is large, we tend to avoid replication and use distributed/partitioned caches.
The partitioning algorithms are generally hash-based and permit replicas, to handle server failures.
So, in the end, a user's data is generally hosted on something like 3 servers (1 primary copy and 2 replicas).
On a local cache miss, the caches are generally able to search for the entry on other cache peers.
This works fine but needs a network round trip.
I'd like to have a load-balancing strategy that avoids this useless network call.
What I'd like to know: is it possible to have a load balancer that is aware of the partitioning mechanism of the cache, so that it always forwards to one of the web servers holding a local copy of the data we need?
For example, I have a request www.mywebsite.com/user=387
The load balancer will check the 387 user id and know that this user is stored on servers 1, 6 and 12, and thus it can round-robin between them, or apply some other strategy.
If there's no generic solution, are there open-source or commercial, software or hardware load balancers that permit defining custom routing strategies?
How much does extracting data from a request slow down the load balancer? What's the cost of extracting a URL parameter (as in my example with user=387) and following some rules to pick the right web server, compared to a round-robin strategy, for example?
Is there an abstraction library on top of cache vendors so that we can easily retrieve the partitioning data and make it available to the load balancer?
Thanks!
Interesting question. I don't think there is a readily available solution for your requirements, but it would be pretty easy to build if your hashing criterion is relatively simple and depends only on the request (a URL parameter, as in your example).
If I were building this, I would use Varnish (http://varnish-cache.org), but you could do the same in other reverse proxies.
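For illustration, a hedged sketch of the routing decision such a balancer would make (Python; the server list, hash function, and replication factor are assumptions, and in practice the hash must match the cache's own partitioning scheme, or the owner table must be fed from it):

import hashlib
import random

SERVERS = [f'web{i}' for i in range(1, 13)]
REPLICAS = 3  # 1 primary copy plus 2 replicas, as in the question

def owners(user_id: str) -> list:
    # Hash the user id to a starting slot, then take the next
    # REPLICAS servers as the set holding a local copy.
    slot = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % len(SERVERS)
    return [SERVERS[(slot + i) % len(SERVERS)] for i in range(REPLICAS)]

def pick_backend(user_id: str) -> str:
    # Choose among the owners only, instead of all servers, so the
    # request lands where the data is already cached locally.
    return random.choice(owners(user_id))

print(pick_backend('387'))  # e.g. one of the 3 servers owning user 387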