As far as I know, the term "data frame" comes from R and its predecessors like S,...

stewbrew · on Feb 16, 2016

"data frame is the core data structure"

AFAIK, a data.frame in R is actually a list of vectors (i.e. columns) with some constraints, such as all columns having the same length.

chubot · on Feb 16, 2016

That's not true:

    > d=data.frame(a=c(1,2,3),b=c(4,5,6))
    > e=list(a=c(1,2,3),b=c(4,5,6))

    > class(d)
    [1] "data.frame"
    > class(e)
    [1] "list"

    > d[c(TRUE,FALSE),]
      a b
    1 1 4
    3 3 6

    > e[c(TRUE,FALSE),]
    Error in e[c(TRUE, FALSE), ] : incorrect number of dimensions

They are represented similarly in R, but they are distinct data types. The data frame is the core data structure in the sense that many functions in R operate on data frames (but not lists of vectors).

stewbrew · on Feb 16, 2016

1. The conventional assignment operator in R is `<-`, not `=`. Both assign at top level, but inside a function call `=` names an argument rather than assigning.

2. To check the actual underlying data structure, use `mode()` (or `typeof()`) rather than `class()`.

    > mode(d)
    [1] "list"
    > mode(e)
    [1] "list"

3. `[` works differently because it is an S3 generic that dispatches to different methods for lists and data.frames -- that's why `class(d)` doesn't return "list". See `methods("[")`.

See https://cran.r-project.org/doc/manuals/r-release/R-lang.html... for details.

infinite8s · on Feb 16, 2016

Same in pandas, although it's closer to a dictionary of column names to singly-typed vectors. Most simple columnar databases are structured that way as well (a more complicated design involves chunking the columns into pages).
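
To make the "dictionary of column names to singly-typed vectors" idea concrete, here is a minimal sketch in plain Python. It is not how pandas or any particular database is implemented. The names `frame`, `check_frame`, `select_rows` and `paginate` are made up for illustration, and the stdlib `array` type stands in for a typed column.

```python
from array import array

# Hypothetical toy "frame": a dict mapping column names to singly-typed vectors.
# array("d") stores C doubles, array("q") stores signed 64-bit integers.
frame = {
    "a": array("d", [1.0, 2.0, 3.0]),
    "b": array("q", [4, 5, 6]),
}


def check_frame(columns):
    """Check that all columns have the same length and return the row count."""
    lengths = {len(col) for col in columns.values()}
    if len(lengths) > 1:
        raise ValueError("columns differ in length")
    return lengths.pop() if lengths else 0


def select_rows(columns, mask):
    """Keep the rows where mask is True, applying the same mask to every column."""
    n = check_frame(columns)
    if len(mask) != n:
        raise ValueError("mask length must equal the row count")
    return {
        name: array(col.typecode, (v for v, keep in zip(col, mask) if keep))
        for name, col in columns.items()
    }


def paginate(col, page_size):
    """Split one column into fixed-size pages, as a chunked column store might."""
    return [col[i:i + page_size] for i in range(0, len(col), page_size)]


subset = select_rows(frame, [True, False, True])  # rows 1 and 3, like the R example
pages = paginate(frame["a"], 2)                   # two pages: lengths 2 and 1
```

Unlike R's `d[c(TRUE,FALSE),]`, this sketch does not recycle a short mask; it requires one flag per row. The equal-length check is the kind of constraint that separates a frame from an arbitrary dict of vectors.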