Pyinfra – automate infrastructure super fast at scale

Fizzadar · on June 11, 2020

I wrote pyinfra! If anyone has questions/suggestions/issues I'm more than happy to answer :)

BiteCode_dev · on June 11, 2020

I'm looking for an alternative to Ansible and Fabric 2, some kind of middle ground, and pyinfra looks like something I want to try.

Is there any way to attach the deployment scenario to a Python object?

All the examples I see are global functions called at the root of a module.

How do I make a scenario pluggable? Reusable? Introspectable?

Fizzadar · on June 11, 2020

> Is there any way to attach the deployment scenario to a Python object?

It is possible to use pyinfra as a Python API, but this is not currently officially supported/may not follow semver, example: https://github.com/Fizzadar/pyinfra/blob/master/examples/api....

> How do I make a scenario pluggable? Reusable? Introspectable?

pyinfra comes with builtin support for packaging "deploys" as Python packages, see: https://pyinfra.readthedocs.io/en/v0.14.5/api/deploys.html, an example: https://github.com/Fizzadar/pyinfra-docker.

BiteCode_dev · on June 11, 2020

Thanks. I like the fact that you can do a quick and dirty script, but then wrap it up in tasks if you want.

pydry · on June 11, 2020

Can/will it be adapted to handle provisioning in cloud environments (e.g. setting up EC2 instances), or is it your expectation that it will work alongside something that handles that part?

Fizzadar · on June 11, 2020

So pyinfra itself won't do this - but because inventory/etc is written in Python it's possible to easily achieve this by using boto3 directly in the inventory file.

The use of Python for the inventory + deploy code makes it possible to integrate pyinfra with almost any external tool, without specific support in pyinfra (that's the theory, at least!).
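A minimal sketch of the boto3-in-the-inventory idea, under stated assumptions: the helper names, the `web_servers` group name and the default region are hypothetical, and the only boto3 call used is `describe_instances` on an EC2 client. It ignores pagination, so treat it as a starting point rather than a finished inventory.

```python
# inventory.py: illustrative sketch only; names below are hypothetical.


def running_public_ips(response):
    """Extract public IPs of running instances from a describe_instances response."""
    hosts = []
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            # Skip anything that is stopped, pending, terminated, etc.
            if instance.get("State", {}).get("Name") != "running":
                continue
            ip = instance.get("PublicIpAddress")
            if ip:  # instances without a public address are skipped
                hosts.append(ip)
    return hosts


def fetch_ec2_hosts(region_name="us-east-1"):
    """Query EC2 via boto3 (requires boto3 and configured AWS credentials)."""
    import boto3  # imported lazily so the pure helper above works without boto3

    ec2 = boto3.client("ec2", region_name=region_name)
    # Note: a single call returns one page of results only.
    return running_public_ips(ec2.describe_instances())


# In an inventory file the result would be assigned to a group variable, e.g.:
# web_servers = fetch_ec2_hosts()
```

Keeping the response parsing separate from the API call means the parsing can be checked against a hand-written dictionary without touching AWS.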

1337shadow · on June 11, 2020

Looking great !

How is it executing remote commands ?

Is it uploading a script like ansible, or just executing remote commands ? The latter seems to be the case glancing at the source code, nice one !

Fizzadar · on June 11, 2020

It executes commands remotely, so no python/etc needed on the targets, just a shell!
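To illustrate what "just a shell" means in practice, here is a rough sketch that shells out to the `ssh` command-line client from Python. This is not how pyinfra implements its connector; the function names and the use of the system `ssh` binary are assumptions made for the example.

```python
import shlex
import subprocess


def build_ssh_argv(host, command, user=None):
    """Build the argv for running a shell command on a remote host via ssh."""
    target = f"{user}@{host}" if user else host
    # ssh joins the trailing arguments into one string for the remote shell,
    # so the command is quoted to reach `sh -c` as a single argument.
    return ["ssh", target, "sh", "-c", shlex.quote(command)]


def run_remote(host, command, user=None):
    """Run the command remotely and return (exit_code, stdout, stderr)."""
    result = subprocess.run(
        build_ssh_argv(host, command, user),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr
```

Nothing has to be installed on the target beyond an SSH server and `sh`; all the logic stays on the machine running the script.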

1337shadow · on June 11, 2020

What a huge improvement to Ansible !

Seems like a much more modern Ansible to me: more precise, but targeting Python users rather than trying to attract non-coders too. It's very promising. I wish I had found your project before I started my own; it's shitty compared to yours, but I'll swallow my shame and share it with you just in case you find an idea that you like: yourlabs.io/oss/shlax

tapoxi · on June 11, 2020

How would you compare pyinfra to Salt?

Fizzadar · on June 11, 2020

I've not used Salt very much at all, but I believe both tools provide similar functionality. pyinfra is agentless only, whereas Salt can run with or without agents, which makes it more scalable at huge numbers of hosts. The major difference is configuration in Python vs YAML, which I personally think is a big advantage (but I can see arguments the other way too)!

usrme · on June 11, 2020

How would you compare this to something like Fabric?

usrme · on June 11, 2020

Sorry if this seems willfully ignorant, but what advantage does this provide over just using something like Ansible?

BiteCode_dev · on June 11, 2020

Fair question.

Ansible is a great system, with a terrible DSL.

YAML is not a programming language, and deployment is not a toy task. You need power.

The result is that Ansible has pushed a markup language to its limit, then pushed the templating engine used in it to its limit too. Then it used a bunch of duct tape + conventions to glue everything together, and ended up with 10% of an unclean, verbose, unexpressive real programming language with ridiculously limited tooling and testability, and had to document + support it.

As with most DSLs.

For what? For the quest of having something "declarative", and that any language could read.

Well, you can have a declarative API written in any language, and nothing except Ansible ever reads playbooks.

In the end, it's just a big weight at the ankle. I never ever used it and thought, "damn, what a pleasant tech choice, I'm happy they didn't directly expose the API through Python".

It's the same reason I use "nox" and not "tox", or "doit" and not "make". DSLs really seem like a great idea, but they fail most of the time. There is a reason only a few of them - like SQL, CSS or regex patterns - became a success. 99% of the time, what you need is a good lib, with a well designed API. For the 0.9% of the time you do need multi-language communication, you may implement RPC. Then, only for the 0.01% case should you really consider a DSL.

But we geeks love to create DSLs. They are fun to write! They are so elegant in tutorials!

Is Pyinfra doing better? I don't know, but I'm sure to give it a try. I really, really want to leave ansible behind, but fabric 2 is not high level or declarative enough.

eradman · on June 23, 2020

Well said! This is largely the motivation for frameworks that give you only what you need to write your own automation:

https://github.com/eradman/rset

https://github.com/rollcat/judo

sly010 · on June 11, 2020

Ansible's pretend-declarative yaml doesn't add any value, while forcing you to use patterns that rarely match your use-case. It should have been a python library without all the yaml from the beginning.

That said, pyinfra seems to be making the same mistakes (being a tool instead of a library, prescribing a folder structure, not letting me create my own abstractions), so you are right that it provides no advantage over Ansible.

BiteCode_dev · on June 11, 2020

While it would be nice to have the core and the framework dissociated, stating that it provides no advantage over Ansible seems overly broad.

You can use python logic to compose your deployment scenario, use imports to reuse features, package modules the same way, etc. The whole ecosystem is also at your disposal, be it libraries, IDE support, linters, formatters, debuggers and so on.

It doesn't need to reinvent the wheel, and you don't need to learn a new syntax.

I would call that a win.

Now, is pyinfra well designed enough to be practical in production, that's another matter that needs testing.

1337shadow · on June 11, 2020

Well then maybe this is more for you: yourlabs.io/oss/shlax (it's still in development, check the pipeline build job to see some outputs)

But honestly pyinfra seems like the clean Ansible rewrite for people who like both devops and Python programming; pyinfra looks like the next major version of Ansible.

BiteCode_dev · on June 11, 2020

Thanks for sharing Shlax, I didn't know about it, and at first glance, I like the design choices: encapsulate task, pass the target object so you can mock it for testing, use await to delegate I/O...
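A rough sketch of those three design choices, not Shlax's actual API: the `InstallPackages` task, the `FakeTarget` test double and the `exec` method name are all hypothetical, chosen only to show a task object that receives its target and awaits it for I/O.

```python
import asyncio


class FakeTarget:
    """Test double that records commands instead of running them."""

    def __init__(self):
        self.commands = []

    async def exec(self, command):
        self.commands.append(command)
        return 0  # pretend every command succeeds


class InstallPackages:
    """A task object: holds its parameters and runs against any target."""

    def __init__(self, *packages):
        self.packages = packages

    async def run(self, target):
        # All I/O is delegated to the target, so a fake can stand in for SSH.
        return await target.exec("apt-get install -y " + " ".join(self.packages))


if __name__ == "__main__":
    target = FakeTarget()
    asyncio.run(InstallPackages("iftop").run(target))
    print(target.commands)
```

Because the task never opens a connection itself, a unit test can assert on the recorded commands without any remote host.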

1337shadow · on June 11, 2020

Shlax still has to prove it can do what it aims with the simple design it strives to keep.

dec0dedab0de · on June 11, 2020

At a glance it seems more comparable to fabric than Ansible. Ansible is not really about controlling computers, it's about declarative configuration.

One major benefit over Ansible is that you only need a POSIX shell on the remote side, instead of a compatible version of Python and possibly some specific libraries. That is also a major downside if you want to manage systems that don't have a POSIX shell.

Fizzadar · on June 11, 2020

Very much this - pyinfra was heavily inspired by both Fabric & Ansible. Beyond POSIX, pyinfra now supports winrm (experimental) - https://pyinfra.readthedocs.io/en/v0.14.5/connectors.html#wi....

drcongo · on June 11, 2020

For me the answer is debuggability. Debugging Ansible can be incredibly painful at times.

usrme · on June 11, 2020

I guess I haven't used Ansible long enough to know such pain points. Would you elaborate with some examples?

0xbadcafebee · on June 11, 2020

This is a bit like asking what the Marquis de Sade did that was so painful.

I don't even know where to begin explaining how frustrating Ansible is. Over time they've compounded one bad design decision on top of another, while never adding any actually useful functionality to be able to troubleshoot or iterate on the many random failures. They also don't provide guidance on how "if you use feature X, features a, b, c, d, and e will not work correctly". Finding a working example of core functionality, like an AWS inventory plugin, requires you to dig through all of the code and scour the internet and experiment for a day before you have a simple working configuration. The more features you use, the more fragile and shitty it becomes, to the point that nobody wants to actually change any roles or playbooks because deployments might stop working and it'll take you two days to figure out how to make them work again in Ansible's bass-ackwards design. And of course, it's Python, so you have to teach all your users how to use virtualenvs, freeze deps, and run it with Docker, or you'll constantly hear "it didn't run correctly for me".

godtoldmetodoit · on June 11, 2020

I've been neck deep in Ansible for the past couple of months, and boy, I am not a fan.

Documentation is middling at best imo and it just feels hacked together.

pampa · on June 11, 2020

It is easy to pick up and do easy tasks, like installing a package on Ubuntu and setting up ufw. Then you throw in a few CentOS hosts and have to write if-else in YAML. Then you try to provision Docker and end up with more if-else blocks to install the required Python packages. Then you get to the part where you want to provision a database with persistent storage inside a Docker container and just give up.
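For contrast, here is a sketch of the same per-distro branching written in plain Python, which is the style the Python-based tools in this thread allow. The mapping and the function name are hypothetical and the package commands are only illustrative.

```python
# Hypothetical mapping from distro name to its install command template.
INSTALL_COMMANDS = {
    "ubuntu": "apt-get install -y {packages}",
    "debian": "apt-get install -y {packages}",
    "centos": "yum install -y {packages}",
}


def install_command(distro, packages):
    """Return the shell command that installs `packages` on `distro`."""
    try:
        template = INSTALL_COMMANDS[distro.lower()]
    except KeyError:
        # Fail loudly instead of silently skipping an unknown platform.
        raise ValueError(f"unsupported distro: {distro}") from None
    return template.format(packages=" ".join(packages))
```

Adding a distro means adding one dictionary entry, and the function can be unit tested like any other code.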

isbvhodnvemrwvn · on June 12, 2020

The two things that came to my mind when I first used Ansible were "bash" and "hudson/jenkins".

It seems like a tool which was written by an ops person (as in hacked together) for ops people (who don't mind stuff being hacked together), specifically for replacing manually distributed cryptic bash files. It's an improvement over that, for sure, but that's it.

lixtra · on June 11, 2020

> run it with Docker

Isn’t the point of Docker that not everyone has to know the virtual envs, etc?

0xbadcafebee · on June 12, 2020

Yes, if you freeze deps you can just install to the system paths in a Docker container. But most people who I've distributed Dockerized ansible git repos to don't read the README and don't try to run it in Docker, so they flail around trying to get it running without a venv or anything, until I finally convince them to just use the container (they're not familiar with containers either so it's almost as much of a slog).

I hate technology.

dec0dedab0de · on June 11, 2020

Ansible error messages have gotten better, but there are still plenty of times where it will take a configuration error such as a missing module argument and report on it as a syntax error.

And then when you get those errors you need to understand what the object is, and now you're running debug on a ton of variables, some of which you first need to set facts on, and others you don't even know are available.

IMO the biggest problem with Ansible is that it feels like it was designed with a really good idea in mind, and then extended and modified by a bunch of different teams who had their own ideas. Which is fine if you've been paying attention, but explaining it all to someone new can be overwhelming.

1337shadow · on June 11, 2020

For example, executing a long command: you will not see any output until the command exits.
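As a point of comparison, this is roughly what line-by-line output streaming looks like for a local process in Python. It is a generic sketch with a hypothetical `stream_command` helper and says nothing about the internals of Ansible or pyinfra.

```python
import subprocess
import sys


def stream_command(argv, on_line=print):
    """Run argv and pass each output line to on_line as soon as it arrives."""
    process = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        text=True,
        bufsize=1,  # line-buffered on the reading side
    )
    for line in process.stdout:
        on_line(line.rstrip("\n"))
    return process.wait()


if __name__ == "__main__":
    # The child runs unbuffered (-u) so its lines are not held back.
    demo = "import time\nfor i in range(3):\n    print('tick', i)\n    time.sleep(0.2)"
    stream_command([sys.executable, "-u", "-c", demo])
```

The child process also has to flush its own output; many programs buffer when they are not attached to a terminal, which is one reason streaming is harder than it looks.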

isbvhodnvemrwvn · on June 12, 2020

And it's a fundamental flaw in the architecture, it can't be easily fixed.

pydry · on June 11, 2020

ansible's declarative manifests got pretty complicated and even (potentially, not sure if it actually happened) achieved accidental turing completeness.

if your DSL gets to that point sometimes it's better to rewrite as a library.

1337shadow · on June 11, 2020

which pyinfra seems to achieve perfectly

sandGorgon · on June 11, 2020

have you looked at Pulumi ?

https://www.pulumi.com/docs/reference/pkg/python/pulumi/

https://www.pulumi.com/blog/programming-the-cloud-with-pytho...

BiteCode_dev · on June 11, 2020

It seems more complementary than a competition, as Pulumi targets clouds, while pyinfra has a lot of tools for self hosting.

ghostwriter · on June 11, 2020

What advantages does Pyinfra have over NixOps?

https://github.com/NixOS/nixops

jackhalford · on June 11, 2020

> NixOps is a tool for deploying to NixOS machines in a network or the cloud

seems like pyinfra is OS agnostic.

zoom6628 · on June 11, 2020

Just checked out the NixOps manual https://nixos.org/nixops/manual/ and it seems able to deploy to NixOS machines on any cloud or LAN platform. But I also could not find any mention of it being able to deploy to non-NixOS servers.

ghostwriter · on June 11, 2020

The question is why would you want to have non-nixos servers in that case. If the goal is to have super-fast automation, you still need to unify your setup around something common. I see more benefits in unifying around NixOS than in unifying around Ansible/Pyinfra, as the latter require me to specify my infra in terms of low-level OS- and distro-specific package managers, that would eventually define the same lack of "OS agnostic portability".

ghostwriter · on June 11, 2020

The frontpage shows that it relies at least on apt-get:

    from pyinfra.modules import apt

    apt.packages(
        {'Install iftop'},
        'iftop',
        sudo=True,
    )

And search over docs doesn't produce results for Windows/MacOS

* https://pyinfra.readthedocs.io/en/v0.14.5/search.html?q=wind...

* https://pyinfra.readthedocs.io/en/v0.14.5/search.html?q=maco...

detaro · on June 11, 2020

it supporting apt doesn't mean it "relies on apt". Looking at the list of options, it supports multiple package managers, including ones for Windows and MacOS.

ghostwriter · on June 11, 2020

but then, if one needs to use low-level primitives representing specific package managers, it's hardly aligned with a claim of "super fast infrastructure automation". With Nix I have exactly one package manager for any target platform.

totetsu · on June 11, 2020

isn't there another python writable iac project?

eliaspro · on June 11, 2020

- Pulumi [1], which supports amongst many other languages Python as well

- SaltStack [2], which has a broad range of execution and state modules, uses by default a "Master/Minion" architecture, but can be used push-based through "salt-ssh" [3] as well

- POP/Idem [4] - which originates in the concept of idempotent SaltStack states, but exposes this functionality as Python code and uses the POP paradigm [5] coined by SaltStack's founder Thomas Hatch. A lot of SaltStack itself will quite likely move towards this architecture in the foreseeable future as well

[1] https://www.pulumi.com/docs/

[2] https://github.com/saltstack/salt/

[3] https://docs.saltstack.com/en/latest/topics/ssh/index.html

[4] https://gitlab.com/saltstack/pop/idem

[5] https://pop.readthedocs.io

totetsu · on June 12, 2020

Thank you

drcongo · on June 11, 2020

There was one by the creator of Ansible but sadly it never reached production status.

fermigier · on June 11, 2020

https://github.com/opsmop/opsmop for the record

drcongo · on June 11, 2020

Thank you, my ageing brain just couldn't surface the name.

mdellavo · on June 11, 2020

saltstack, fabric

whalesalad · on June 11, 2020

This sounds like Capistrano rewritten in Python?

eeZah7Ux · on June 11, 2020

How do you prevent human error from breaking production?

nerdponx · on June 11, 2020

"Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?"

1337shadow · on June 11, 2020

Play your stuff on a staging server first, and if possible on a container or vm in the test pipeline of your infra code.

And well, start with a backup and be careful when you're working in production but that's common sense right ?

eeZah7Ux · on June 11, 2020

Staging works well in my experience. How can you benefit from a test pipeline and a staging system with tools that bypass it and ssh right into the production systems?

VWWHFSfQ · on June 11, 2020

what kind of human error

eeZah7Ux · on June 11, 2020

Any error in any command or configuration file that should never be run on production.

peteradio · on June 11, 2020

Keep a menacing hammer in the corner of the room. Walk in once a week and fondle it with a vicious grin, just waiting and daring someone to make you use it.

knodi · on June 11, 2020

ya fix pip and pipenv first. Not starting any more new python projects.

BiteCode_dev · on June 11, 2020

The author of Pyinfra is just a dev like you and me working on a side project.

There is no relationship between pyinfra, pipenv and pip.

There is not even a relationship between pip and pipenv; they are completely different teams.

In fact, none of those projects are maintained by the Python core team or part of the Python project. Not even pip, which is separated from Python and provided using get-pip.py at install.

throwaway894345 · on June 11, 2020

And deployment and crosscompilation (for native runtime dependencies) and performance, although those are probably not significant issues for this application.

DrJones1098 · on June 11, 2020

What's your issue with it? Asking as a noob to Python package management. I've always used venv and pip and didn't even know pipenv existed until I read your post. Should it be avoided?

drcongo · on June 11, 2020

Personally I'd recommend Poetry [1] at this point for Python package management. 99% of the time it works perfectly and keeps out of your way. There's also a lot of good info in the comments here [2]

[1] https://python-poetry.org

[2] https://news.ycombinator.com/item?id=23380113