Opening the code of our X-Pack features (elastic.co)
159 points by Benfromparis on Feb 28, 2018 | 89 comments



This is incredibly user-hostile, and if you read this expecting this is an open-source release, consider yourself clickbaited.

Want to contribute to the open-source version of Elasticsearch? Want to ensure you're running an open-source database without running a custom EULA past your legal department?

Well, make sure that after 6.3 is released, you don't accidentally download the default distribution, or browse the official Github repository, or clone from Github, as all of those will have components subject to an as-yet-unreleased, non-OSI-approved EULA (see the final paragraph in https://www.elastic.co/products/x-pack/open), which may have clauses that trigger liability to Elastic if you accidentally flip a switch or look at the wrong thing. (I am not a lawyer, this is not legal advice, but I shouldn't have to be to know if I can use software freely, and I can only guess worst-case scenarios because the terms aren't released.)

If Elastic were truly committed to transparency, they would release the terms of their new EULA immediately, provide a way to access the head of the open-source portion of the repository without accepting the EULA, and make it clear what the pricing ramifications are if one opts into certain flags or reads certain files in a commercial capacity. (Pricing requires a custom quote, but a quick Google search suggests that the lowest tier in which REST and node-to-node communications are encrypted begins in the tens of thousands of dollars - not the kind of thing you want to take lightly.)

Now more than ever, it's important that projects like https://github.com/floragunncom/search-guard , a free and open-source (Apache 2.0) alternative to some of these X-Pack components, are supported, and that the word is spread around them. (EDIT: Yes, they have commercial components, but those are in a separate repository, so you avoid any entanglements. Why isn't Elastic doing the same?)

And hopefully, if Elastic doesn't maintain a fully-OSS repo, someone will create and maintain a friendly fully-OSS fork that can be combined with third-party software to carry on the legacy of this otherwise excellent database in the right way.


One of the awesome things about our products is that people care deeply about them. There is one thing, in particular, I want to address.

'If Elastic were truly committed to transparency, they would release the terms of their new EULA immediately...'

We are working on it. This change is not one that was taken lightly and we have to ensure we have the right language for when we do release the license, etc. We will put it online as soon as we are able.

We are committed to transparency and, also, we are committed to our users.

Yes, we are asking for a fair bit of trust...and I hope we continue to prove ourselves worthy of that trust.

<disclaimer: I work at Elastic in Developer Relations>


Sorry if the following sounds like a personal rant: it is.

You're saying that people care deeply about your product. In reverse, I don't get a feeling you care much about the community around your products. Since you're working in developer relations, I'd like to point out that you're still funneling people to your IRC channels via https://www.elastic.co/community, but once people arrive there, they barely get any help from elastic people. Despite multiple requests, the channels are still not logged or searchable, so questions get asked over and over again. I'm a long-time lurker there, and especially questions about how to contribute to the projects end up in silence. People asking how to work on tickets (in the tickets) as part of a university course or as part of a GSoC assignment: silence.

There's barely anyone offering guidance on how to contribute to the open source project beyond "you'll need to sign the CLA." The CLA is contributor-hostile. Anything people touch as part of their work cannot be contributed unless they get legal involved - a show stopper for many. The CLA, at least in some versions, required full copyright assignment and indemnification - I, personally, can say that it stopped me from providing any kind of fixes or improvements.

I used to run the Berlin elasticsearch user group, which started out before elastic, the company, even existed. At some point it was one of the largest ES UGs world-wide; still, any kind of support from elastic beyond personal support from some developers: nonexistent. A heads-up and info about feature announcements so that we could prep a talk fitting the announcement: not there. Sending a speaker for an announcement? Impossible. At some point, elastic even tried to charge us for the privilege of running a UG. The best offer we received was an offer to pay for pizza. We did get an honorable mention at the first elasticon, IIRC.

> Yes, we are asking for a fair bit of trust...and I hope we continue to prove ourselves worthy of that trust.

Good luck. I've heard these words before. Any kind of interaction with elastic, the company, that I had as a community member was borderline hostile. I wish things would improve, but I've pretty much lost faith.


I have lost faith as well. In my case it might just be that Elastic is working on too many integrations, or that Rails is simply not a priority for them, but the official elasticsearch-rails gem isn't slated to support ES 6.0 until Q2 or Q3 of 2018[0] - and remember that Elasticsearch 6.0 was released in November 2017.

[0] https://github.com/elastic/elasticsearch-rails/issues/756


On the other hand,

Their discourse community is very active. ElasticSearch people help out there a lot. They are people in the end, and they can't be everywhere. They are spread thin and they do their best. Open source doesn't mean super human. Lack of resources doesn't mean "hostile". If you think not being able to respond is hostile, then wait till you really see the hostile part of the world.

For everything ElasticSearch does, this is a very critical and negative outlook.


> They are people in the end, and they can't be everywhere.

I agree, but they mention IRC as an official contact and support channel. Either do that and then support it, or don't. But you can't say "you can get help on IRC" and then not be there, that just does not work. People do actually come to the channel and expect answers and they do treat volunteers there like they're paid support staff - an expectation that is understandable given that it's listed on the official page.

> Lack of resources doesn't mean "hostile".

No, but I didn't say so. I said the interactions were borderline hostile. I won't recount the episodes here, but generally they involved asking us to go out of our way to organize things and then dropping the ball. If you ask me to do stuff for you for free, then you'd better follow through. Lack of resources is no longer an excuse then.


Yes, indeed. Discourse would be a good place to ask questions. We can't answer all questions, especially long "did I design my domain indexes correctly" ones, but we strive to help and answer questions about Elastic.


Since I have an ear here, I have to say I hope the EULA doesn't require arbitration in Delaware like the Basic license does. These sorts of terms make this a no-go for us - same as when Atlassian required mandatory arbitration in (New Zealand?); our lawyers refused to let us use JIRA because of that, and other clauses in their license.

I'm about to uninstall x-pack for the same reason. Hopefully we can use the 6.3 release, but it will depend on the terms.


Agreed. I can't see how combining Apache- and non-Apache-licensed source code in a single repository is a good thing.


If you're looking for real FOSS ES security, I recommend Search Guard https://github.com/floragunncom/search-guard . This is an ASL2-licensed repo which provides all kinds of basic security for Elasticsearch, including SSL/TLS. So the contents of this repo are really, fully "open source".

Then there is a second non-FOSS repository https://github.com/floragunncom/search-guard-enterprise-modu... which contains all the enterprise/advanced stuff. The code is open but not "open source".

So FOSS and non-FOSS/proprietary code are well separated. I don't understand why elastic mixes and merges FOSS and non-FOSS code in the same repo, with a weird mixed license applicable to special folders. This all makes no sense to me.

And there are several other FOSS projects which provide X-Pack-like functionality, described here https://sematext.com/blog/x-pack-alternatives/ - like elastalert for alerting or sentinl for reporting.

If one is really committed and addicted to FOSS, then I believe it's better to go with the FOSS X-Pack alternatives and to support and contribute to them.

Just my 2c.


So a developer could make substantive contributions to XPack, but still owe Elastic money when they deploy to prod?


That's exactly why it will not work.


Here is the EULA: https://www.elastic.co/eula


Yeah so has anyone actually tried to get ElasticSearch up and running lately? I just tried and had a terrible time, despite the fact that I was using ElasticSearch + Kibana, and it was dockerized, and it was on Kubernetes (there's more complexity, yes, but all those tools make deployment simpler once you understand them, not harder -- writing a pod resource config to get a thing running means I don't have to run around my system changing settings, I just put all of it in one place). XPack was just another stumbling block while trying to get everything running.

The combination of lack of documentation, inconsistent/changed configuration (ENV vs YAML vs values that just don't exist anymore), breaking changes between versions that rendered Kibana completely useless, and the recent (?) removal of plugins that expose web APIs (so I couldn't use something like elastic-head) all added up. This is all in Kubernetes, btw -- maybe it's just that I wasn't smart enough to get it done, but it's so easy to write functional (if not well-configured) configurations for other databases that I was at a loss for words when nothing I tried worked right.

I got so angry trying to set up ElasticSearch that making a F/OSS competitor is now #2 on my list of projects-to-do-next. I'm sure the thought is naive but I need to find out for myself that there's no easier way.

Imagine if the team behind Prometheus had focused on search instead of metrics? That's the kind of tool I want to use: a tool as focused, easy to start with, clearly documented, and straightforward as Prometheus.


"making a F/OSS competitor"

So, Solr? Good luck getting SolrCloud set up on Kubernetes. ;-)

More seriously though, my answer to "has anyone actually tried to get ElasticSearch up and running lately?" is yes. I just worked on spinning up a cluster (using docker) at my current job. At my last two jobs I also managed ElasticSearch (without docker). There are plenty of gotchas with ElasticSearch, but I've never found the initial setup to be a challenge. To be fair, I've never touched X-Pack.


Call me insane +/- naive, but I was actually thinking of "just" gossiped/quorumed SQLite+FTS5.

In the end I got Elasticsearch running, but it wouldn't connect to Kibana properly. I exaggerated too much -- much of my frustration was with ES not working properly with Kibana. I kept notes on what went wrong/what I was struggling with, but I don't even want to look at them now; they'll be in a blog post someday.


You'd need to handle concurrent writes, so something like a WAL, so why not build on RocksDB?

And okay, quorum, and sure there are a lot of Raft libs out there, but it's a bit harder than "new Cluster(Consistency.QUORUM)" :)


The thing is, I don't want to build search myself -- SQLite has a WAL (of course), runs in memory if you want (though RocksDB admittedly has less holding it back from utilizing memory even more efficiently than SQLite could), and brings all its usual creature comforts, and I can lean on SQLite's FTS search.
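That FTS piece really is the easy part -- a minimal sketch with Python's built-in sqlite3 (illustrative only; it assumes your SQLite build includes the FTS5 extension, which is true for most modern distributions):

```python
# Minimal embedded full-text search via SQLite FTS5 (illustrative sketch;
# assumes the sqlite3 module was built with FTS5 enabled).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO docs (title, body) VALUES (?, ?)",
    [
        ("intro", "elasticsearch is a distributed search engine"),
        ("alt", "sqlite fts5 offers embedded full text search"),
    ],
)
# MATCH queries the inverted index; ORDER BY rank sorts by BM25 relevance.
rows = conn.execute(
    "SELECT title FROM docs WHERE docs MATCH 'embedded' ORDER BY rank"
).fetchall()
print(rows)  # [('alt',)]
```

Sharding, replication, and membership would still have to live on top, but the single-node search itself comes basically for free.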

All I have to get right is the quorum (I'm actually thinking optimistic gossip with something like SWIM over a quorum with paxos/raft), and the sharding, and replication -- and that stuff has been worked through by people much smarter than me already.

The formula I think will work is basically SQLite + SWIM/Raft + a consistent hashing algo + optimistic replication + optimistic rebalancing. For just about 100% of the things on that list, I don't have to think too hard to implement them, and it should be performant in the happy case (where n/2 nodes are up and healthy and relatively performant).
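The consistent-hashing piece of that formula fits in a few lines (the `Ring` class below is hypothetical and purely illustrative): each key is owned by the first node clockwise from its hash on the ring, so adding or removing a node only reshuffles the keys in that node's arc.

```python
# Illustrative consistent-hash ring (hypothetical helper, not a library API).
import bisect
import hashlib

def _h(s: str) -> int:
    # Stable hash so placement survives process restarts.
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes, vnodes=64):
        # Virtual nodes smooth out the key distribution across nodes.
        self._points = sorted(
            (_h(f"{n}-{i}"), n) for n in nodes for i in range(vnodes)
        )
        self._hashes = [p for p, _ in self._points]

    def node_for(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping at the end.
        i = bisect.bisect(self._hashes, _h(key)) % len(self._points)
        return self._points[i][1]

ring = Ring(["node-a", "node-b", "node-c"])
owner = ring.node_for("doc:42")
print(owner)  # one of node-a / node-b / node-c, stable across runs
```

Replication then just means also writing to the next k distinct nodes clockwise from the owner.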

Recently saw a talk from FOSDEM 2018 (https://fosdem.org/2018/schedule/event/datastore/) about a project called Timbala and learned a bit from it (for example, I'd assumed everyone was just using paxos/raft, but SWIM is evidently used by Consul).


Here's a quick plug for the project I'm working on at the moment: https://github.com/jetstack/navigator

It's a framework for managing databases on kubernetes, with initial support for elasticsearch and cassandra. It's still early in development, but any feedback would be great.


kube-lego & certmanager are amazing, thank you for the work you put in @ jetstack.

I want to give your tool a try, but a custom API server seems like a lot in the way of complexity (I thought operators were at most beefy controllers? Do custom API servers fit the "operator" pattern?) -- and I literally only want to run Elastic so I can complete the EFKK stack (and try it out).


FWIW, I've done the same recently (deploy ES + Kibana on K8s) and it pretty much just worked. Statefulset, EBS volume claims, official Docker images.

Didn't use XPack or anything fancy though, haven't updated it and the only addon I'm running is https://github.com/lmenezes/cerebro


yeah it's becoming increasingly clear that I must have tripped myself up/got frustrated too fast.

When you set up cerebro, how did you set up the CORS headers? Did you go with allow "*"?


I totally agree... It's crazy that they don't even offer a docker-compose file of sorts to bring all their own tools together and demo their power.

I recently wanted to see ELK in action...and it took me a few hours to set it all up and configure everything together with just basic docker and docker-compose. It really should not be that hard :/
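For what it's worth, the demo setup boils down to something like this (image tags and settings here are illustrative, not official -- check the current Elastic docs, since image names and defaults move between releases):

```yaml
version: "3"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.2.2
    environment:
      - discovery.type=single-node  # demo only: skip cluster bootstrap
    ports:
      - "9200:9200"
  kibana:
    image: docker.elastic.co/kibana/kibana-oss:6.2.2
    environment:
      - ELASTICSEARCH_URL=http://elasticsearch:9200
    ports:
      - "5601:5601"
```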


We try to get you most of the way there with https://github.com/elastic/stack-docker/blob/master/docker-c.... It takes care of x-pack as well.


My team created a helm chart for this. EFK soup to nuts with x-pack. I'll see if we can't publish it.


Since you've done it, perhaps you could share :-)


I will soon! (I did it for a work related project so I want to make sure that I go through a few formalities to make the repo public).

It's a pretty cool repo - includes templates for ELK on K8S, ECS and Docker Swarm (Compose).

I'm working on testing some HA on it next week!


Graylog is a good alternative, check it out.

If OpenShift didn’t do the heavy lifting of deploying and securing Elasticsearch I wouldn’t be using it at all, and because of that mess I actually use Graylog in my lab at home because it’s substantially less of a pain in the ass and security isn’t a feature locked behind a proprietary license or writing your own proxy.


Graylog is exactly what I was reaching for (I've had a similar experience and was blown away and delighted by how easy it was to use), but it's a bit heavyweight -- they (suspiciously) don't list minimum requirements anywhere; the only scrap I could find was in the Graylog OpenStack docs, where they suggest you have 4GB of RAM free.

The machine I'm running on isn't small, I have the memory, but it just feels like a slippery slope.

Also, Graylog = Java + Mongo + ES and I'm almost philosophically opposed to using Mongo for personal reasons (this is a personal project so I can afford to have some self-defeating bias).


The prebuilt Graylog virtual machine appliance (OVA) defaults to a pitiful amount of RAM (I think 512MB? 1GB?) and we used it in production successfully for a very long time in this configuration. We bumped it up but just because it seemed like a good idea, not because the memory was giving us any trouble. From our Graylog dashboard currently:

> The JVM is using 637.8MB of 980.1MB heap space and will not attempt to use more than 1.4GB

It also defaulted to a single vCPU, which seemed to be fine. It seems like Graylog can scale down pretty well if needed.


Does this include what you're giving Mongo + ElasticSearch? The graylog process isn't all I'm worried about, it's kind of the combination of the three.

Regardless, I'm probably going to just use Graylog then -- I'm not running a large environment by any means, and while I've been at a company where graylog was used in production (which is where I heard about it), people often complained about it hogging resources. Time has passed, and I'm sure that if it's good enough for you, it's more than good enough for me (especially since I'm not running anything "in production").

I still want to get the EFKK stack up and running though. Right now there are basically two self-hosted choices, ELK/EFK or Graylog, plus some hosted options (Splunk, Sumo Logic?, others). I'd like to at least stand up both self-hosted choices once and get a feel for them (and I've done Graylog before).


Splunk’s not a bad piece of software; I just prefer open source options over proprietary solutions where feasible (which is why I don’t use EFK -- I refuse to pay money for security, and I think it’s bullshit that Elastic has made that part of their business model with the x-pack), but for small environments the free version can get you far.


Not in any way affiliated with Elastic but XPack is now included in Elastic by default, so there's that -- of course it does say something that they included it in their enterprise offering first.

Same here on the open-source-first mentality. I also managed to get the EFK stack working so now I don't feel bad actually choosing Graylog in the long run.


Not all of the X-Pack features are free; security still requires a Gold subscription with Elastic. In fact, there’s very little functionality in X-Pack that DOESN’T require at least a Gold subscription.


Graylog doesn't require tons of memory in my experience; it always benefits from more as your logs grow, but that's just a fact of life when it comes to any kind of database. I've run it on 2GB of RAM before (this is just the smallest amount I ever give a VM because that's what it takes to netinstall CentOS 7 these days) without issue on smaller amounts of logs (10-20MB/day).

I'm not a fan of MongoDB myself, but Graylog uses it as not much more than a distributed configuration store so I just begrudgingly accept it.


Would you mind sharing the resources allotted to Mongo + Elasticsearch? I'd consider those under the umbrella of Graylog.


I don’t have statistics as it was used in my lab at home which I have recently torn down and begun rebuilding (new servers, new hypervisor and not enough of a crap given to v2v the VM’s I had instead of reinstalling).

From memory though, MongoDB didn’t use much since it mostly stored configuration for Graylog, the Graylog processes themselves took up a couple hundred MB and elasticsearch ate up everything I allowed it to (typical behavior of a database though).

I didn’t bother tweaking any of the settings and just relied on memory pressure of the VM everything ran on to limit resource usage, if you’re keeping lots of history and need fast access to it then obviously you’d need to give ES more RAM to work with.


Thanks for sharing -- I wasn't aware that Graylog only used mongo for the configuration information -- sounds like they're using it as a synchronization option... Wonder if they're working on any alternatives like etcd or even kubernetes-native synchronization options... After a little looking it looks like the answer is "no" (https://community.graylog.org/t/will-mongodb-ever-be-replace...)


It would be amazing to see a Golang based search project startup with a clean codebase as a starting point. I would work on that project for sure!


I could agree with that. Probably this is an option to look at: https://github.com/blevesearch/bleve


>Yeah so has anyone actually tried to get ElasticSearch up and running lately?

Actually, yes. I just finished doing our migration from ELK 1.7 to 6.1.3.

We're using installs direct on VMs (rather than docker), and for that we push the configuration/install using Ansible. Their Ansible role[1] works reasonably well for installing Elastic. The Kibana and Logstash configurations were done using regular RPM install from the repo.

[1] https://github.com/elastic/ansible-elasticsearch


Well clearly I didn't try hard enough -- the ansible roles look perfectly reasonable. A quick look through the notes I took and my biggest problems were with:

- Close versions of ES+Kibana not working together

- maxConcurrentShardRequests not being set on Kibana for some reason (so when I got them talking, a silly query parameter was holding everything up)

- I wasted a ton of time due to some files from a failed installation causing an obtuse error -- I think it was a NoShardAvailableActionException


> Well clearly I didn't try hard enough

Well, I had the advantage in that I already knew I wasn't touching it on Docker with a ten foot pole, and we use Ansible, so that made my google search pretty obvious.

> Close versions of ES+Kibana not working together

Yep, that's a pain in the arse, and a trap for inexperienced players still.

Also of note is that the latest versions available through the package repository are not the same as the latest supported by the Ansible role. The Ansible role will install a specific version of Elastic; you'll have to be careful to take note and synchronise that with the versions of Logstash and Kibana you install. (This is why we're on 6.1.3.)

> - maxConcurrentShardRequests not being set on Kibana for some reason (so when I got them talking, a silly query parameter was holding everything up)

> - I wasted a ton of time due to some files from a failed installation causing an obtuse error -- I think it was a NoShardAvailableActionException

Yeah, can't really help with either of these two - I already had a working ELK 1.7 install, so for us it was pretty much a case of standing things up, performing some modifications to templates/queries/etc, and off we went.


> Well, I had the advantage in that I already knew I wasn't touching it on Docker with a ten foot pole, and we use Ansible, so that made my google search pretty obvious.

But the thing is, docker shouldn't actually make things that much harder -- it's just the same old process + namespaces + cgroups. In theory not that much is different, I'm not sure why reality so often doesn't match up.

> Yep, that's a pain in the arse, and a trap for inexperienced players still.

Yeah, I got mega trapped. At one point I started walking back versions, trying them in lockstep (to get away from the maxConcurrentShardRequests and NoShardAvailableActionException issues, before I realized that the latter was due to stale data on disk). I started bouncing between docker repos for this stuff -- elastic stopped publishing to Docker Hub, but there are images like blacktop/kibana and bitnami/kibana that still exist. Once I try again with a clear head I'm sure it will be easier.

Yeah I actually filed a ticket on the maxConcurrentShardRequests thing -- it seems like a real bug and it's waiting for triage.


I just want to note that I was likely still exasperated from the defeat of not being able to install ES properly (which was likely my own fault as many others have been able to install it just fine), and this post should be taken with a grain of salt.

People are hard at work on ES, sharing their progress with the OSS community (the background behind X-Pack aside) and maintaining an OSS version, and I'm grateful to them for that.


We rely on it - Elastic 6.2.2, logstash latest - we forgo kibana. But to be fair, we completely repackage this into our own dockers, to make life better.


How do you watch your logs? I couldn't for the life of me find an alternative to Kibana that interoperates well with ES.

Grafana should be possible, but it just seems like no one uses grafana for just plain log watching.


We have our own closed source product - it's a key part of what we're building.


I've spent at least 2 weeks this year trying to get Kubernetes and logstash/elasticsearch to work together with an endpoint. One week on getting the golang client to deal properly with the changing ip on a restarted elastic pod (solved), and one week on logstash doing it (unsolved), with x-pack mucking up things royally.

I wish I was angry, but I'm just defeated and annoyed.


I'm sure that you've seen this already but just in case you haven't:

https://github.com/kubernetes/kubernetes/tree/master/cluster...

I didn't follow it to the letter because I'm stubborn but


I have seen it, it doesn't address the problems with logstash, or the golang elasticsearch client.


Elasticsearch is a mess. It's so full of historical warts.

One major problem is that none of their documentation is actually reference documentation -- if you look for the formal schema (for things like mappings and the query DSL), the list of endpoints and their allowed parameters, the full list of settings etc., you won't find them listed anywhere. For example, do "keyword" mappings support the "enabled" property? What does the "index_options" setting actually do when combined with the "index" setting? Hard to tell any of this without trying them out. Turns out "dynamic_templates" mappings support any combination of the above, and will never complain about invalid combinations, whereas property mappings do. The whole environment variable vs Java property mess that you mention also exists.

They do deserve credit for trying to clean it up. The last few releases have been pretty brutal in how they've been deprecating (and later removing) legacy features and tightening the semantics, the newest and most dramatic of which is the deprecation of multiple type mappings per index. And they've been pretty good at explaining what's going to happen. So the warts are getting fewer. On the flip side, you have to follow the release notes religiously if you want to keep up to speed, since each release now tends to remove a bunch of features or add strict validation where there previously was none, and it becomes harder to upgrade. (If you want an important bug fix that hasn't been backported, things could get expensive.)

It's interesting how the Elasticsearch team let their focus be derailed by this new industry obsession with analytics and logs. It's not something ES was originally built for, and it turned out to be good at it mostly by accident. It's not terrible at it, but Elasticsearch shines the most for its original purpose, as a content index with rich full-text search capabilities. (Areas where it works less well include scaling edge cases such as high-cardinality aggregation buckets and high numbers of unique field names.) I wish they'd rather worked on things like joins and fixing the need for the "nested" object type, which is a ridiculous hack, but since those things aren't needed for analytics/logs, they haven't happened.

(Pet peeve time: One problem that rarely gets mentioned is that Elasticsearch's "eventually consistent" model has two parts. There's the part where replicas may be out of sync with primaries, but there's also the problem that on each individual node, index operations don't become visible to queries right away, not until the next segment "refresh", which by default happens every second. There's no API to ask about the refresh state, so right now the only way for a write followed by a read to be consistent is to ask the write to wait for a refresh (or to force a refresh), which is the opposite of what you want; the wait should be on the read, not the write. Given that ES now has a sequence number associated with shards, I'm surprised they haven't tied those numbers together with refreshes so you can ask about which sequence number the index is currently "at".)
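The visibility half of that can be sketched with a toy model (a hypothetical class, not the Elasticsearch API): writes get a sequence number immediately but only become searchable after a refresh, and exposing the last-refreshed sequence number is exactly what would let the reader, rather than the writer, do the waiting.

```python
# Toy model of per-node refresh visibility (illustrative only, not the
# Elasticsearch API): writes land in a buffer and only become searchable
# after refresh(), mirroring the ~1s segment refresh cycle.
class ToyIndex:
    def __init__(self):
        self._buffer = []          # indexed but not yet searchable
        self._visible = []         # searchable "segment" contents
        self.seq_no = 0            # sequence number of the last write
        self.refreshed_seq_no = 0  # what a "refresh state" API could expose

    def index(self, doc):
        self.seq_no += 1
        self._buffer.append(doc)
        return self.seq_no

    def refresh(self):
        self._visible.extend(self._buffer)
        self._buffer.clear()
        self.refreshed_seq_no = self.seq_no

    def search(self, term):
        return [d for d in self._visible if term in d]

idx = ToyIndex()
seq = idx.index("hello world")
print(idx.search("hello"))  # [] - written, but not yet visible
idx.refresh()
print(idx.search("hello"))  # ['hello world']
# A reader could poll until refreshed_seq_no >= seq instead of
# forcing the refresh (or the wait) onto the writer.
```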

So I think Elasticsearch is definitely ripe for disruption. I don't know of anything else that is able to compete at the moment, at least not in a single package; Solr isn't really in the same league.


One of my primary grievances with ES is that all security is a (paid) add-on. TLS, even most basic authentication, doesn’t come out of the box. I really expect that from a modern product. (Yeah, I know search-guard exists, it’s still an add-on)


It’s surprising when someone expects consistent ops from elastic when it’s built on something that has none of it (lucene).

At least Solr doesn’t pretend to be something it isn’t (a database).


>At least Solr doesn’t pretend to be something it isn’t (a database).

I worked on implementing Elasticsearch at my company, and one of the things they mention clearly is that Elasticsearch should NEVER be used as the source of truth (primary database).


I think I understand where the OP was coming from.

ElasticSearch + Kibana often gets positioned and used as an open source alternative to Splunk. In that context it is in all respects operating as a primary database since often the source logs are transient.


Performance is also a black box. Super fast on small datasets, but at scale... better hope you can pay for that Platinum support contract, and be prepared not to use all the fancy features like collapse.


But scale is literally why people use ES, right? It expects a cluster almost out of the gate; I felt like I was using it wrong even trying to run only one instance on one machine.


Well, as a NoSQL search engine. But it implements sexy features with heavy performance penalties.


solr is pretty amazing and, for me, more accessible and easier to use


Yeah, I copied and pasted the RPM snippets from their install page, installed them, and had it up in no time.

This was a KVM VM though, not Docker, but I found the documentation was fine.


Background: They introduced X-Pack by bundling it with the default distribution as a time-limited trial, without explicitly stating that it is just a demo. People who did the update were bitten weeks later when it just stopped working because the demo license had expired. [Documentation of this was ridiculously bad, and I only learned through this post that there apparently was some way to get a free license (maintaining an instance for a 501(c)(3); I just assume we'd have qualified).]

This looks like an attempt at fixing their karma balance, but until I've reviewed the EULA I am pessimistic about the value of "allowing for some derivative works". And I don't really get the "allowing for some [..] contribution". <zyn>Is a patch that improves performance by 10% welcome but I'll have to pay them to make them accept a patch that improves performance by 100%?</zyn>


Disclosure: I work at Elastic, primarily on the X-Pack features in Elasticsearch

> They introduced X-Pack by bundling it with the default distribution as a time-limited trial without explicitly stating that it is just a demo.

I assume you're referring to the Elastic Docker images, as I believe that's the only place where we've ever bundled X-Pack without any explicit opt-in (our Windows installer also includes X-Pack, but it's clearly marked as an optional & commercial component).

We totally underestimated the confusion and difficulties that would cause, and it was fixed with the 6.0 release.

The difficulty was balancing the different needs of different users with the constraints of how Docker containers are typically managed. X-Pack requires a file-system-level install, which is not something that Docker users expect or want - an image should be built once and then be essentially immutable. No one likes to have to enter the container in order to install a new plugin.

Since it was possible to disable X-Pack functionality, or install a free license for a subset of the features, shipping the container with X-Pack pre-installed and letting users dial it back as needed seemed like the better option, compared with shipping without X-Pack and forcing customers to reconfigure their container to get the features they needed.

We didn't expect that there would be so many users for whom Docker was the primary/initial point of contact with our stack. We believed that we would mostly be working with users who already understood what X-Pack was and what they were getting. When it became clear that that wasn't true, we had to come up with a solution, which is what we shipped in the 6.0 release. We didn't want to change the behaviour in a minor release and cause more confusion for users who were relying on X-Pack being installed, so it wasn't simply a case of changing the way the images were built.

The X-Pack licensing code was built on the premise that X-Pack was a plugin that was explicitly installed, so that those who installed it would know what was going on. And when it was written, that was true. One of the consequences of that assumption was that it would automatically generate a trial license on start-up, and then you could install your own purchased license after the fact. In order to offer Docker containers that worked the way users would expect, we had to change that so that X-Pack could be installed but default to enabling only the free features. Since 6.0 we are able to provide three different Docker "flavours": pure open source, basic (free license), and platinum (trial license for paid features).
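For concreteness, these flavours were published as separate images on Elastic's Docker registry in the 6.0-6.2 era; the image names and tag below are a best-effort illustration and should be checked against the registry for your version:

```shell
# Illustrative image names -- verify against Elastic's Docker registry.
# Apache 2.0-only build, no X-Pack:
docker pull docker.elastic.co/elasticsearch/elasticsearch-oss:6.2.4
# X-Pack with only the free (basic license) features enabled:
docker pull docker.elastic.co/elasticsearch/elasticsearch-basic:6.2.4
# X-Pack with a trial license for the paid (platinum) features:
docker pull docker.elastic.co/elasticsearch/elasticsearch-platinum:6.2.4
```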

We do want people to use X-Pack. We believe that the free (basic license) features offer something useful that users should know about, and the success of the company relies on users knowing that our commercial features exist, and being able to evaluate them and decide if that's something they want to purchase. But the docker situation was never intended to be a "bait-and-switch", it was just a problem that caught us by surprise and took some time to rectify.

For historical accuracy, the inclusion of X-Pack in our docker images was not the introduction of X-Pack, it had been available for quite a long time before that and the underlying commercial IP was several years old before we started publishing Docker images.


First, thanks for your response; it appeared much more straightforward and honest than the press release to me. Sincerely, thank you!

I personally was running an instance on FreeBSD and just went with the latest port after a quick glance at the docs. There was a new "lite" (or similar) port that did not include xpack, but after skimming the docs I saw no reason not to use these new features (even though in my setup they don't really provide any additional security). A few weeks later it all stopped working, some docs were broken (I checked against the source) and then I saved the rest of my weekend by simply rolling back to the previous setup.

To sum it up: my decision to move from Solr to ES because of the much nicer configuration backfired, and I effectively lost time compared to just using Solr (which I know but despise). But some early ES blog post (IIRC) indicated that the ridiculous setup of Solr was the initial motivation to write ES in the first place, so it is a sad twist that ES messed up exactly that :(


How would you do a time-limited trial for an immutable, widely distributed Docker image?

There either has to be some fingerprinting and a phone-home, or a licence you "manually" set in the env.

Since I am unaware of any sufficiently reliable host/cluster fingerprinting opportunity in say Kube, I assume it needs a licence to be configured in the pod yaml file.

What I don't understand is how this confusion can arise given the explicit licence installation.

Or where did I go wrong?


It depends on what "immutable" means to you.

Elasticsearch is stateful, it's essentially useless if you don't have some form of persistent storage for it, because its purpose is to act as a datastore. So we make no attempts to treat the Elasticsearch containers as if they're stateless in the "no permanent storage" sense, but you can point your Elasticsearch data directory to a mounted volume and keep that state away from the core image.
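As a minimal sketch of that pattern (the image name and tag are illustrative; the data path shown is the conventional one in official images):

```shell
# Keep Elasticsearch's state on a named volume so the image stays immutable.
docker volume create esdata
docker run -d --name elasticsearch \
  -p 9200:9200 \
  -v esdata:/usr/share/elasticsearch/data \
  docker.elastic.co/elasticsearch/elasticsearch-oss:6.2.4
```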

But having an immutable image is a bigger deal. The expectation that users have of docker images is that you don't need to edit anything on the image files themselves, but that the configuration is provided from an external source, typically through environment variables.

Installing X-Pack is an image change. It stores new files in the plugins, bin, and config directories of Elasticsearch, and will typically require changes to your main configuration file. That's something Docker users generally don't want.

Installing a license is a state change. The license is loaded with an HTTP API, and we store that data in the data directory alongside the stored documents. So for most users it's not a violation of their "immutable" expectations.
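As a hedged sketch, installing a purchased license in a 6.x cluster looks roughly like this (endpoint as documented for Elasticsearch 6.x; `license.json` stands in for the license file you received):

```shell
# Upload the license through the REST API; it is persisted in the data
# directory alongside the stored documents, not on the image filesystem.
curl -XPUT 'http://localhost:9200/_xpack/license' \
  -H 'Content-Type: application/json' \
  -d @license.json
```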

The other expectation is that you can simply start the container, and run it to completion and then terminate it - you don't want a process that expects to be started, and then reconfigured, and then restarted, etc. That affects other aspects of how you can configure Elasticsearch, but doesn't impact the licensing issue quite so much.


Thank you for the explanation. Data directory shared across all pods is the answer then.

Given that the process was automagic, I can see the potential for confusion.

Seems you are handling the fallout nicely though.


The important part is near the bottom: free components of X-Pack will no longer require the cumbersome registration process. Also of some concern is that extra care will be required to find a version of the software that is Apache 2.0 licensed, as the proprietary-licensed X-Pack components will be included by default in the code base instead of as a separate install.

---

"Also, X-Pack features will now be bundled into the default distribution. All free features are included and enabled by default and will never ‘expire’, and commercial features are opt-in via a trial license. The license for free features never expires, and you no longer need to register to use these capabilities. In addition to this, an Apache 2.0-only distribution will be created for download."


Here we have a company that is simply opening up visibility of the code of their commercial offering, which will enhance customer relationships vs. keeping the code a black box -- and it's seen as user-hostile?

Open core isn't evil. Companies need to make money, and shouldn't have to choose between fully-open-source vs. closed-source.

Fully-open-source companies have a really hard time building a good company, because despite the great work they put out into the world, other enterprising cloud companies can build closed-source cloud services around it. For example, I've paid Compose.io and other companies real money around MongoDB, and never a cent to MongoDB the company. Same goes for Docker, who has struggled to build a great business because they open-sourced so much of their value.

So yeah, Elastic is trying to make it more likely that you'll choose to buy a license, because the health of their business depends on getting paid customers.


Eleven uses of the word free, all used in the gratis, not libre, sense of the word.

'Open' is in the same category for me now as 'cloud' -- too nebulous, too little consensus on what it actually means, often used in a 'don't worry about the details' hand-wavy way.


This thing of having the source "out there" (but not OSS) worries me and is very confusing for users because it creates dangerous grey areas.

If you think like me, check out the actually FOSS (GPLv3) alternative to X-Pack security I created:

https://github.com/sscarduzio/elasticsearch-readonlyrest-plu...


Their licence is still proprietary, though. At the least people with a technical curiosity in X-Pack will be able to install it. Technically it's illegal, but Elastic probably doesn't care until the moment they start gaining momentum.


If their license is still proprietary what do they mean by "opening" the code?


They mean you can see the code. Note that they never say they're "open sourcing" it - it's not open source by the Open Source Definition.

This is the license the X-Pack code is under: https://www.elastic.co/eula


I wouldn't put it past the Elastic team to include some sort of phone-home mechanism here.

Nor would I put it past them to consider legal demand letters as part of their sales funnel.


In 6.3, telemetry (‘phone home’) is an opt-in only feature for both the free and paid components of X-Pack. We won’t be tracking any identifying information, so even if users opt in to telemetry, there is no way to turn that into a sales strategy.

Suing users is a pretty terrible business practice and not, at all, a funnel creation opportunity.

Re. License, there are a few FAQs on https://www.elastic.co/products/x-pack/open#faq that may help clear up any misunderstandings. Free features are free and governed by the Elastic EULA but are enabled by default. Paid features are available through an opt-in trial and, if they help solve a problem, can be purchased.

<disclaimer: I work at Elastic in Developer Relations>


And this trial can't be disabled (to use the features for an unlimited amount of time) by modifying your source, because your license prohibits that? Am I correct in that assumption?


Based on what exactly?


Based on a previous experience being in their sales funnel and speaking to their reps at re:Invent.

They have an incredible product that I can't live without, it's just incredibly painful trying to actually pay them money for something.

Honestly, I think their real play here is getting community contributions to fix stuff that's very broken and also to edge out Siren Solutions. SearchGuard, Kibi and Sentinl are all _very_ good tools.


I'm sorry to hear it was painful to pay us... I wish that wasn't the case.

If you want, I'd be happy to talk more about it so that we can understand where we can improve. Feel free to email me tyler.hannan<at>elastic.co

<disclaimer: I work at Elastic in Developer Relations>


Not OP. One reason I find it hard to pay you is that there is no way to buy your product without talking to someone. The pricing is not listed anywhere on your website.


That changed Jan 1, while we were in the middle of figuring out what we were going to pay them -- then some X-Pack features disappeared from the "Gold" plan, and the cost they quoted us was _way up_ from what we were evaluating.

There actually still is a pricing page somewhere on the site, but you have to dig for it, and I don't think it's correct anymore.


I think opening the source gives others opportunities to learn from the code and create their own alternatives.

That's what drives the community to get involved in their software.


The main benefit may be that users can more easily use the code if they can read it. If you can see the code then you can understand how it works exactly, without relying on documentation to be accurate. You can also more easily debug your application if you have the source of third-party libraries.

For writing clones or alternatives, it might be better not to read any proprietary code: https://en.wikipedia.org/wiki/Clean_room_design


You have a reason, and that's probably one of the main ones. Sometimes I use open source projects to find a way to create something.

Checking many solutions can give you an idea of how to solve the puzzle. Making these projects open source can give me ideas for creating plugins. Of course, you still need to think for yourself about whether it fits your needs and quality standards, and whether you might run into copyright trouble.


I'm a cynical es admin, but I'm pretty sure this is just a polite way of saying "you're still gonna need support" more than it is "it's good to go, run it yourself!"

Between ES and RMQ, I find myself running software written before the cloud was big but trying to be the drivers of the cloud -- those two pieces consume more resources, both physical and operational, than the rest of our stack replicated three times, combined.


Dear Elastic people reading this thread: we hope you won't do evil with your licence.

If you don't do evil (e.g. write anything that could taint other people's IP when they work with Elasticsearch, prevent compatibility with other products, or scare people away from even looking at this code or at the whole repository), it will be... a step in the right direction.

If you do evil, it will cost you dearly in the end, with FUD spreading that companies should be enormously wary of having anyone even look at those repositories.

Hope we'll continue to cheer and celebrate your achievements.


Saw this in my inbox yesterday and was excited... until I read it. Nothing changed. Ultimately Elastic is still looking for a revenue model beyond consulting that makes sense.

Our company needed to build a bunch of this ourselves, simply because of the bundling cost. At the end of the day many, many organizations would love the ability to PURCHASE A SINGLE PLUGIN not all of x-pack. Ignoring the cost - the reality is that x-pack is useful in large enterprise, and a reasonably tiered model to develop against would help startups get going, and tie us in for the long-term. Instead, we built our own.

The sheer number of times I've had direct conversations with people at Elastic (from CEO down) about this makes me cringe. We rely on Elastic. We currently have our own ecosystem around it. We're also going to (over time), probably have to open source bits and pieces, so that others don't have the same pain.


Honestly, X-Pack is just a pain for the small developer. It should be opt-in and installed only by the user, not the other way around; I would probably be using some parts of it that way.

I used to love Elastic; I even have a couple of small pull requests merged. I felt bad for using such awesome software for free and not giving back. But X-Pack changed that. I am now a bit afraid of putting my trust in the software, and the first thing I do is uninstall X-Pack when I get my hands on an Elastic cluster or GitHub repo that carries it.



