Context: Ruff is a Rust-based linter for Python. It’s much faster than Python-based linters, largely because Rust is much faster than Python.
There are a couple of drawbacks to Ruff that mean it may not be right for everyone: since it’s a compiled tool, it’s harder to add custom rules — you need to fork the project to add your own, and also you need to know some Rust.
When checking the Pylint codebase it makes sense to use Ruff as well as Pylint — it’s a good sanity check that Pylint’s own tooling is not letting issues through that another tool is catching.
Yeah Ruff and Pylint can be used in a complementary way. Pylint implements rules that Ruff does not, and vice versa. I think Ruff might actually support more _total_ rules than Pylint at this point (not a great metric), but Pylint does more cross-file analysis and type inference. [1]
Some teams use Ruff alongside Pylint, some have replaced Pylint with Ruff entirely (Dagster is a good example [2]), especially those already using a type checker alongside their linter.
When you are already using flake8/ruff and mypy, there is little room left for pylint in my opinion. It might have some extra checks but they aren't enough to justify the performance hit.
I use a script that identifies modified files in my working copies, runs pyflakes on them, and after that pylint -E.
By the time pylint finishes, I'll have already fixed most of the low-hanging fruit that pyflakes found, and then I look through the pylint messages to see if there's anything additional.
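For anyone who wants to replicate that setup, here is a minimal sketch of such a script (the exact git invocation and the definition of "modified files" are my own assumptions, not the commenter's actual script):

```python
#!/usr/bin/env python3
"""Lint only the Python files modified in the working copy:
a fast pass with pyflakes first, then pylint in errors-only mode."""
import subprocess
import sys

def modified_python_files():
    # Files changed relative to HEAD; adjust to taste (staged-only, etc.).
    out = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [f for f in out.splitlines() if f.endswith(".py")]

def main():
    files = modified_python_files()
    if not files:
        return 0
    # Fast pass: pyflakes catches most of the low-hanging fruit quickly.
    pyflakes = subprocess.run(["pyflakes", *files])
    # Slow pass: pylint restricted to errors only (-E).
    pylint = subprocess.run(["pylint", "-E", *files])
    return pyflakes.returncode or pylint.returncode

if __name__ == "__main__":
    sys.exit(main())
```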
How many thousands of lines of code do you need to lint before Rust's performance advantages actually matter? After all, performance optimizing a tool where N is too small to matter is the root of all evil. Keep in mind that you don't have to lint an already linted file if you cache its hash.
I honestly thought the same thing. I used flake8 and it finished in about a second on my codebase. But when you have that on a git hook, you'd be surprised how even a small amount of latency affects your productivity. I have been using ruff for about two months now, and I'm still thinking "Wow, that was fast" every time I create a commit.
Linting is something that people might do on every commit these days. Commits that take more than 150ms to complete are annoying. And also on every CI run. Oh well every CI run doesn’t sound bad. Except you’re also running 30 other things in order to check 30 other “should always work/be the case” invariants.
EDIT: Missed your point about caching already-linted files. In that case maybe performance doesn’t matter.
> After all, performance optimizing a tool where N is too small to matter is the root of all evil.
Picking a third-party tool that does what it does and happens to be relatively fast. Where’s the evil in that?
> Keep in mind that you don't have to lint an already linted file if you cache its hash.
Yeah, if only that were the case. Even if the file hasn't changed, if its dependencies have changed you need to lint it again. And doing dependency analysis on most of the weirdly coupled codebases out there is neither trivial nor accurate, tbh.
And not running a good linter and letting a bug through is the most shit feeling you will have in your life. Trust me on that.
For linters that perform type inference, fanout means having to relint files that might not have changed. But many linters don’t need the full graph — for example, file formatting problems aren’t subject to fanout.
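To make the fanout point concrete, here is a toy sketch of the two cache-key strategies. `dependencies_of` is a hypothetical helper; computing it accurately is exactly the hard part on coupled codebases:

```python
import hashlib
from pathlib import Path

# Naive cache key: fine for purely local checks (formatting, unused
# imports), wrong for anything doing cross-file analysis.
def local_cache_key(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

# Fanout-aware cache key: must also hash every transitive dependency,
# so a change anywhere in the import graph invalidates this file.
# dependencies_of() is hypothetical; building it is neither trivial
# nor, on weirdly coupled codebases, accurate.
def fanout_cache_key(path: Path, dependencies_of) -> str:
    h = hashlib.sha256(path.read_bytes())
    for dep in sorted(dependencies_of(path)):
        h.update(Path(dep).read_bytes())
    return h.hexdigest()
```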
I also think that these rewrites are throwing the baby out with the bathwater. Especially in projects related to code quality and analysis, things like widespread adoption and issues going back almost a decade are much more important than raw performance.
For every aspect of functionality, there is heavy baggage: quirks, bugs, edge cases, etc. I am not convinced that a rewrite of a tool that is still in development and still limited to a subset of functionality can capture nearly a decade of community work.
The author of Ruff mentioned that many people told him exactly what you're saying. He went ahead and built it anyway. I'm puzzled though. Those naysayers were naysaying about an attempt at rewriting. You appear to be naysaying a successfully built project.
It's true that the folks who find it most helpful are the ones with large Python projects. But these are sometimes very important, widely used projects. See the testimonials from users (https://beta.ruff.rs/docs/#testimonials)
> widespread adoption
This makes little sense in the context of a linter. As long as my linter implements all the popular rules and my code is kept tidy, what's the issue? What does it matter if there's widespread adoption or not?
> still limited to a subset of functionality
Do you have a link to some functionality you're missing that's necessary for your workflow? Or is this more of a theoretical concern?
I am happy if this tool works for you but I think writing off my comment as naysaying is not constructive.
There is no support for all syntax features of the current language specification and reference implementation, and there is no legacy support either. Again, especially in projects concerned with code quality and analysis, building is not more important than long-term maintenance: both the language specification and the reference implementation are constantly evolving. The current approach relies on an upstream interpreter and thereby inherits its idiosyncratic divergences and limitations.
There is no extensibility. The majority of projects are neither public nor frameworks. Often projects employ a number of frameworks each with associated dialects, with both frameworks and dialects also constantly evolving. The current approach to extensibility is approximately one person manually rewriting rules individually from heterogeneous projects that span several contributors over several years. There is already outstanding maintenance work on manually introduced rules, only a subset of which gets flagged by a fraction of adopters. There is even outstanding work on feature parity.
>There are a couple of drawbacks to Ruff that mean it may not be right for everyone: since it’s a compiled tool, it’s harder to add custom rules — you need to fork the project to add your own, and also you need to know some Rust.
That might not be insurmountable, given that most people don't write their own rules (they just combine, configure, and enable/disable existing ones), and that it could be possible to add to Ruff a way to write rules in a DSL (without knowing Rust itself, and without needing to recompile).
>But if you start going down that route, isn’t it likely that you’d end up making Ruff into a full-blown interpreter?
Not any more than with any other DSL. You can always stop at a subset expressive enough to handle linting rules, or at least many linting rules. It doesn't need to be able to express every possible rule, which is the only reason you'd make it a full-blown interpreter. Pareto and all.
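As a purely hypothetical illustration of how small such a subset can be (this is not anything Ruff ships, just a toy hosted in Python for readability): rules become data (a node type, a predicate, a message), and the engine is a dumb tree walk, nowhere near a full interpreter.

```python
import ast

# Toy declarative rules: (node type, predicate, message).
# Expressive enough for many real lint rules; no Turing completeness.
RULES = [
    (ast.Compare,
     lambda n: any(isinstance(op, ast.Is) and isinstance(c, ast.Constant)
                   and c.value is not None
                   for op, c in zip(n.ops, n.comparators)),
     "use == to compare against a literal, not 'is'"),
    (ast.FunctionDef,
     lambda n: len(n.args.args) > 7,
     "too many arguments"),
]

def check(source, filename="<string>"):
    for node in ast.walk(ast.parse(source, filename)):
        for node_type, predicate, message in RULES:
            if isinstance(node, node_type) and predicate(node):
                yield (filename, node.lineno, message)

for issue in check("x = 1\nif x is 2:\n    pass\n"):
    print(issue)  # flags the 'is 2' comparison on line 2
```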
Even if that's not possible (and you need Turing completeness to handle custom rules), then:
>At either of which point, the motivation to use Ruff instead of e.g. Pylint might begin to disappear?
Well, Ruff would still have crazily faster parsing and built-in rules. It's not like you'd be forced to use custom interpreted rules.
And even in that case, a purpose-built interpreter, even if general, could be way faster than embedding CPython (because it's optimized for a specific task and doesn't need to carry baggage like the GIL and other CPython design decisions), and could offer better task-focused primitives.
You can add dynamic rules, but they either need to be compiled alongside the tool or written in a format that the tool can parse. That interchange language can be anything from plaintext to WASM bytecode.
I’ve used the former system (compiling things together in a custom build) in a Rust-based static analysis tool I’ve created. SWC uses the latter system, requiring plugins to be authored in WASM.
It's a design choice by its creator. Funnily enough, I was listening to an episode of 'Talk Python To Me' with the creator yesterday, and he talked about exactly that.
The main reason given was that each plugin has to reimplement a bunch of the same stuff slightly differently (e.g. finding public methods), and keeping it all internal avoids this and keeps things speedy.
I had started writing a 1-for-1 Pylint port in Rust (almost nowhere right now), but Ruff has gotten there way faster. Props!
Pylint is rough to make faster, partly because of its existing structure, but honestly at some point you are just paying a huge cost for traversing wide ASTs and doing all this work, and then paying extra for the Python object model.
In my profiling of Pylint, I've found some fixes, and I am curious what something like __slots__ or mypyc could do for it... but I'm also very partial to just having a systems-language implementation.
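For context, the __slots__ experiment is cheap to sketch. A slotted class drops the per-instance __dict__, which matters when a linter allocates millions of AST-node-like objects (toy classes below, not astroid's actual ones):

```python
import sys

class Node:
    def __init__(self, lineno, col):
        self.lineno = lineno
        self.col = col

class SlottedNode:
    # __slots__ removes the per-instance __dict__: smaller objects and
    # slightly faster attribute access, multiplied millions of times.
    __slots__ = ("lineno", "col")

    def __init__(self, lineno, col):
        self.lineno = lineno
        self.col = col

a, b = Node(1, 0), SlottedNode(1, 0)
print(sys.getsizeof(a) + sys.getsizeof(a.__dict__))  # dict adds overhead
print(sys.getsizeof(b))                              # noticeably smaller
```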
People talk about ease of contributing as a big advantage of writing these kinds of tools in the language they operate on. Personally, I've found that the people who like messing with tools also like the idea of learning a new language in order to contribute.
My one real reservation about Ruff is that the code base is designed as "run each checker as its own function". My suspicion is that this leads to a performance ceiling, as each checker traverses the AST looking for its flavor of thing. Visitor patterns (theoretically!) allow all the checkers to run over a single traversal, and parallelize nicely. Granted, the Ruff pattern allows for parallel work as well (and you can share all of the AST data between threads anyway), so maybe it's "right". It definitely looks easier to maintain!
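To spell out the contrast (a toy sketch, not Ruff's actual architecture):

```python
import ast

# Pattern A: each checker walks the whole tree itself.
# Cost: one full traversal per checker.
def run_separately(tree, checkers):
    for checker in checkers:
        for node in ast.walk(tree):
            method = getattr(checker, "visit_" + type(node).__name__, None)
            if method:
                method(node)

# Pattern B (visitor-style fusion): one traversal, with each node
# dispatched to every interested checker. Cost: one traversal total.
def run_fused(tree, checkers):
    for node in ast.walk(tree):
        name = "visit_" + type(node).__name__
        for checker in checkers:
            method = getattr(checker, name, None)
            if method:
                method(node)
```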
It is already happening, and with the availability of tools like PyO3 [1] (which lets Python developers write performance-critical code in Rust and expose it as a Python module), the Rustification of Python will keep progressing, even as some controversies appear [2]. Adding a new language adds complexity to a Python code base, but it is always good to at least have the choice, and to decide case by case whether to use Rust in a Python app.
too bad it doesn't appear to support flake8 plugins and only has a small subset hardcoded into it. if it could support arbitrary flake8 plugins I would move SQLAlchemy to this tool immediately.
the main reason people want to move off flake8 is not so much a speed issue but that its author refuses to allow it to use pyproject.toml files for configuration, meaning every project's configuration has to remain half-broken using obsolete setup.cfg files. A very silly situation.
Yeah we don't support arbitrary Flake8 plugins. We do plan to support plugins eventually but the exact architecture and API are TBD.
If there are plugins that would be particularly impactful for you, I'd be happy to take a look at adding them. (We support some of the plugins you use in SQLAlchemy, like flake8-docstrings and flake8-builtins, but not flake8-rst-docstrings which I'm guessing is a blocker.)
thanks for the reply! we use a bunch of goofy ones that nobody else does. one of them is like a 3 line plugin that has only a handful of stars on github.
That might be equivalent to using Ruff's isort implementation with `force_single_line` but maybe that comes with other implications that you don't want for SQLAlchemy.
I really tried to use isort, but it doesn't do the import-order style that we use, and there were some other things that didn't work for us. flake8-import-order has selectable styles, which I match in my own "zimports" tool that we use to actually write our imports. flake8-import-single then adds the extra idea that you shouldn't have "from tool import x, y, z", since that defeats the purpose of import sorting, which is to avoid merge artifacts.
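For anyone unfamiliar with that last rule, the difference looks like this:

```python
# Multi-name imports invite merge artifacts: two branches that each
# add a name both edit this one line and conflict.
from tool import x, y, z

# One import per line (what flake8-import-single enforces): additions
# land on separate lines and merge cleanly.
from tool import x
from tool import y
from tool import z
```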
Are smaller plugins like this potentially an easy place for first time contributors to work in?
My Rust skills are certainly very lacking but if there is some big list of arbitrary flake8 plugins that there is a desire to have ported, that sounds like it could be a relatively easy place to get one’s feet wet.
I maintain a few open source libraries and we have recently switched to Ruff from Flake8 exclusively because of the lack of pyproject.toml support.
These are, relatively speaking, small codebases, and the speed improvements really don’t make a difference for us, but in trying to get our own projects in line with current standards, flake8’s refusal to support it was the final nail in the coffin.
I think the point was reasonable when pyproject.toml first came about and parsing it was a bit more of a Wild West; I can certainly understand the unwillingness to try to support it then. But now, with Python itself having TOML support in the standard library, every other major tool standardizing on it, and TOML being the de facto config format for Rust projects, it just seems like a bizarre hill to die on.
It’s not as if the work hasn’t been done; there are forks of flake8 with pyproject.toml support, but as far as I’m aware, multiple PRs adding support for it have been declined.
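To underline how small the ask has become: reading a tool's table out of pyproject.toml is now a few lines of stdlib. (The [tool.flake8] table below is hypothetical, since flake8 never adopted it; [tool.ruff] is what Ruff actually reads.)

```python
import tomllib  # in the standard library since Python 3.11

with open("pyproject.toml", "rb") as f:  # tomllib requires binary mode
    config = tomllib.load(f)

# Tools conventionally live under [tool.<name>].
flake8_config = config.get("tool", {}).get("flake8", {})  # hypothetical
print(flake8_config)
```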
it's utterly ridiculous and has all the indications of a maintainer who was grumpy about making a change (a place I know *very well*, I am there multiple times a week myself), and now as it's gone on for years, they no longer can really back down from their original "complaints" even though they are absurd at this point.
The good news is some maintainers can have a change of heart. I seem to remember there being a certain someone who thought asyncio python was pointless, but now SQLAlchemy has asyncio support and that's super cool!
I've never understood why HN has never included even an 80-character description. I think it would really help, even if it was only shown on the comments tab.
> You can't. This is to prevent people from submitting a link with their comments in a privileged position at the top of the page. If you want to submit a link with comments, just submit it, then add a regular comment.
If you want to add more context, add it in a comment. If it’s good people will upvote it.
One reason I can think of is that it encourages informative titles - I enjoy being able to click directly from the home page to the article, without having to look at the comments. A lower percentage of posters would explain Ruff in the title if there were a description. (Clearly it's currently not 100% either; that's why I'm here.)
Though this is something that keeps coming up on Reddit as well, I assume there's a reason why HN went with this design.
There's a very clear graph right at the top of Ruff's GitHub page, easy to find and research yourself: https://github.com/charliermarsh/ruff. You can't miss it if you take the time to check out the project.
Your comment seems skeptical. Practically, it’s a night-and-day difference if you’ve integrated a linter with your editor. Plus, Ruff is able to fix as well as lint.
Didn't mean it that way, something being 100+ times slower is a LOT. And I care a lot about tool speed. I just didn't really think I needed to be shamed for asking a question they had a ready answer for.
I was excited to start using Ruff; however, it still has quite a few ruff edges, so I can't use it professionally yet. Back to pylint for now, but I'm looking forward to adopting it once it's ready. The difference in speed was night and day.
Another feature I hope to see in Ruff is the ability to "score" the code. With pylint, I can have a CI rule that makes the lint check pass if the code scores 8/10 or above, or even a lower score so that co-workers who resent using linters can start using them.
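For what it's worth, the pylint side of that gate is just a flag; a minimal CI wrapper might look like this (exact exit-code behavior varies a bit across pylint versions, so treat this as a sketch):

```python
import subprocess
import sys

# pylint exits non-zero when the score falls below --fail-under.
# Lower the threshold to ease reluctant co-workers into linting.
result = subprocess.run(["pylint", "--fail-under=8", "src/"])
sys.exit(result.returncode)
```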
I'm glad to see Python and Rust combined organically. My own experience was rewriting a Python project in Rust [1], which made it about 40 times faster. The original Python project and the rewritten Rust code were then combined through PyO3, which works very well. We achieved not only performance but also ease of use and scalability. Just like Pylint and Ruff.
I clicked through, did a search for Ruff, then looked up the Github project. Okay, but… so what? Is this another case of "I rebuilt X in Rust and hit the front page of HN"?
It’s a big deal because linters have normally been written in pure Python, and they are prohibitively slow for large code bases. This tool provides an important speedup that will save my team hours of waiting on CI.
I am curious how big your codebase is that it saves hours. I never found the existing Python based tooling to be slow enough that I felt the need to consider alternatives. Though, I will gladly take the speedup.
Any program that took longer than several minutes to "lint" for me seemed to blow up anyway, usually by running out of memory or hitting some infinite recursion.
Our codebase is somewhere between 200k and 300k LOC. Running the flake8 suite takes minutes. We haven't integrated ruff, but during a test run it ran in less than a second, IIRC.
So, with, say, 20-30 commits per day, that's about an hour, every day. Sure, it's not like we're usually sitting there twiddling our thumbs during builds, but sometimes we actually do have to wait for a build to get ready. Just the waiting time during those rare occasions adds up.
[edit]: flake8, remarkably, dropped roughly an order of magnitude in performance between version 2 and 3. So that didn't help.
> [edit]: flake8, remarkably, dropped roughly an order of magnitude in performance between version 2 and 3. So that didn't help.
Sorry for the drive-by, but did you try bisecting the flake8 change that slowed it down? Unless there were thousands of commits between releases, it would only take <10 runs to bisect; an automated bisect should be under an hour with a test case that takes a few minutes. Could be an interesting thing to try?
Doubt it would be helpful. It appears that the model changed dramatically between those two versions. For instance, the whole API (which we relied on) was dropped[0] and some legacy API was glued in place.
In either case, the information wouldn't be all that useful. We run our stuff off of Debian packages (yes, we're old school), and we're not particularly interested in maintaining our own fork.
I get that bit, what I don't get (or don't see a good reason for) is why Ruff isn't the link and this is. Link to the actual project if that's the subject. If someone wants to compare performance, write a blog post or something instead of wasting everyone's time pointing to references in a different project.
Is Ruff a good replacement for pylint yet? It replaces flake8 well, but since I use black and pylint, I don't need flake8. However, pylint does a lot, so if Ruff could replace it, even without plugins, that would be amazing.
This honestly depends on whether you are familiar enough with the Ruff codebase and Rust syntax, no different than just "doing it by hand".
At this point (IMHO), ChatGPT does not make a meaningful difference in the production of software, aside from individual workflow preferences. The reason being that ChatGPT will produce a lot of code that is very likely to be subtly incorrect in difficult-to-spot places.