With Pandas, typically the important data types are the series data type for col...

rattray · on April 1, 2019

Sorry, I meant static types, like mypy

goatlover · on April 1, 2019

Right, my point was that static typing doesn't seem like a good fit for a library like Pandas, which needs to be flexible enough to handle a wide range of use cases for tabular data where you're cleaning, reshaping, etc. At least not without a total rewrite.

With Flow in JS, how would a JSON library for similar purposes work when you don't know what kind of data you'll get and you'll be doing a lot of transformations on it?

pushtheenvelope · on April 2, 2019

I think designing for static types changes the design of a library's API. So, in this case, Pandas would likely have to evolve (handwave) in some fashion if it were to support mypy types.

One way mypy could help with this is by implementing a feature like Type Providers: https://docs.microsoft.com/en-us/dotnet/fsharp/tutorials/typ...

rattray · on April 2, 2019

If you _really_ don't know what type of data you have, you type it as `mixed`. But if you _do_ know the shape of your data, you write your own types and tell the library about those types (e.g., with generics). Think of something like `my_df: DataFrame<MyDataShape> = pandas.DataFrame(my_data)`, where you manually define the shape of `MyDataShape`.
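To make that concrete, here is a rough sketch of the idea in Python typing terms. It assumes a hypothetical generic wrapper called `TypedFrame`, which is not a pandas API; the `MyDataShape` fields and the sample rows are also made up for illustration. The point is only that a hand-written row shape can flow through a generic container so that a checker such as mypy knows what comes back out.

```python
from typing import Generic, List, TypedDict, TypeVar


class MyDataShape(TypedDict):
    # Hypothetical row shape, written by hand as described above.
    name: str
    age: int


RowT = TypeVar("RowT")


class TypedFrame(Generic[RowT]):
    """Hypothetical stand-in for a generic DataFrame; NOT a pandas API."""

    def __init__(self, rows: List[RowT]) -> None:
        # Copy the input so later changes to the caller's list don't leak in.
        self._rows = list(rows)

    def row(self, index: int) -> RowT:
        # The return type is whatever row shape the frame was declared with.
        return self._rows[index]

    def __len__(self) -> int:
        return len(self._rows)


# Illustrative data only.
my_data: List[MyDataShape] = [
    {"name": "Ada", "age": 36},
    {"name": "Grace", "age": 45},
]

my_df: TypedFrame[MyDataShape] = TypedFrame(my_data)

# A static checker can see that this is an int, because the row shape
# was supplied as the type parameter.
first_age: int = my_df.row(0)["age"]
```

With this shape, something like `my_df.row(0)["agee"]` should be flagged by the type checker before the code runs, since `MyDataShape` has no such key. Whether that would scale to the reshaping operations Pandas supports is a separate question.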

Scala is statically typed and is often used for data processing.

Depending on the situation, codegen can also be useful, e.g. https://github.com/typeorm/typeorm

EDIT: in any case, this certainly answers my question :-) sounds like they're not there yet, and perhaps not even moving in that direction yet.