Vita
A 2-dimensional labeled data structure with columns of potentially different types.
#include <dataframe.h>
Classes
class | columns_info |
Information about the collection of columns (type, name, output index).
struct | example |
Stores a single element (row) of the dataset.
class | params |
Public Types
using | const_iterator = examples_t::const_iterator |
using | difference_type = examples_t::difference_type |
using | examples_t = std::vector< example > |
using | filter_hook_t = std::function< bool(record_t &)> |
A filter and transform function (returns true for records that should be loaded and may transform its input parameter).
using | iterator = examples_t::iterator |
using | record_t = std::vector< std::string > |
Raw input record.
using | value_type = examples_t::value_type |
Public Member Functions
iterator | begin () |
const_iterator | begin () const |
std::string | class_name (class_t) const |
class_t | classes () const |
void | clear () |
Removes all elements from the container.
dataframe () |
New empty data instance.
dataframe (const std::filesystem::path &) |
dataframe (const std::filesystem::path &, const params &) |
New dataframe instance containing the learning collection from a file.
dataframe (std::istream &) |
dataframe (std::istream &, const params &) |
New dataframe instance containing the learning collection from a stream.
bool | empty () const |
iterator | end () |
const_iterator | end () const |
iterator | erase (iterator, iterator) |
Removes specified elements from the dataframe.
value_type & | front () |
Returns a reference to the first element in the dataframe.
value_type | front () const |
Returns a constant reference to the first element in the dataframe.
bool | is_valid () const |
bool | operator! () const |
void | push_back (const example &) |
Appends the given element to the end of the active dataset.
std::size_t | read (const std::filesystem::path &) |
std::size_t | read (const std::filesystem::path &, const params &) |
Loads the content of a file into the active dataset.
std::size_t | read_csv (std::istream &) |
std::size_t | read_csv (std::istream &, params) |
Loads a CSV file into the active dataset.
std::size_t | read_xrff (std::istream &) |
std::size_t | read_xrff (std::istream &, const params &) |
Loads an XRFF file from a stream into the dataframe.
std::size_t | size () const |
unsigned | variables () const |
Public Attributes
columns_info | columns |
A 2-dimensional labeled data structure with columns of potentially different types.
You can think of it like a spreadsheet or SQL table.
Dataframe:
Definition at line 47 of file dataframe.h.
using vita::dataframe::const_iterator = examples_t::const_iterator |
Definition at line 121 of file dataframe.h.
using vita::dataframe::difference_type = examples_t::difference_type |
Definition at line 122 of file dataframe.h.
using vita::dataframe::examples_t = std::vector<example> |
Definition at line 55 of file dataframe.h.
using vita::dataframe::filter_hook_t = std::function<bool (record_t &)> |
A filter and transform function (returns true for records that should be loaded and may transform its input parameter).
Definition at line 65 of file dataframe.h.
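As an illustration of the "filter and transform" contract, here is a minimal sketch of a function that fits the filter_hook_t signature. The two aliases are copied from this page; trim and skip_incomplete are hypothetical names, and treating an empty field as a missing value is an assumption made only for the example. How a hook is registered through params is not shown on this page, so that step is left out.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Aliases reproduced from the class documentation.
using record_t = std::vector<std::string>;
using filter_hook_t = std::function<bool(record_t &)>;

// Hypothetical helper: strips leading and trailing spaces/tabs.
std::string trim(const std::string &s)
{
  const auto first = s.find_first_not_of(" \t");
  if (first == std::string::npos)
    return {};

  const auto last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Hypothetical hook: trims every field in place and rejects the record as
// soon as a field turns out to be empty (treated here as a missing value).
bool skip_incomplete(record_t &r)
{
  for (auto &field : r)
  {
    field = trim(field);
    if (field.empty())
      return false;  // do not load this record
  }

  return true;  // load the record, with its fields trimmed
}
```

Because the hook receives the record by non-const reference, the trimmed fields are what a caller would see after the hook returns true.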
using vita::dataframe::iterator = examples_t::iterator |
Definition at line 120 of file dataframe.h.
using vita::dataframe::record_t = std::vector<std::string> |
Raw input record.
The ETL chain is:
FILE -> record_t -> example --(vita::push_back)--> vita::dataframe
Definition at line 61 of file dataframe.h.
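The first stage of the chain (FILE -> record_t) can be pictured with the sketch below, which splits one line of text into a record_t. It is not the library's parser: to_record is a hypothetical name, and the quote handling is deliberately simplified (commas inside double quotes do not split fields, quote characters are dropped, escaped quotes are not supported). The later stages (record_t -> example -> dataframe) depend on the example struct, which is not detailed on this page.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Alias reproduced from the class documentation.
using record_t = std::vector<std::string>;

// Hypothetical helper: turns one comma-separated line into a raw record.
record_t to_record(const std::string &line)
{
  record_t r;
  std::string field;
  bool in_quotes = false;

  for (const char c : line)
  {
    if (c == '"')
      in_quotes = !in_quotes;  // entering or leaving a quoted field
    else if (c == ',' && !in_quotes)
    {
      r.push_back(field);  // an unquoted comma closes the current field
      field.clear();
    }
    else
      field += c;
  }

  r.push_back(field);  // the last field has no trailing comma
  return r;
}
```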
using vita::dataframe::value_type = examples_t::value_type |
Definition at line 56 of file dataframe.h.
vita::dataframe::dataframe | ( | ) |
New empty data instance.
Definition at line 181 of file dataframe.cc.
explicit vita::dataframe::dataframe | ( | std::istream & | ) |
Definition at line 200 of file dataframe.cc.
vita::dataframe::dataframe | ( | std::istream & | is, |
const params & | p | ||
) |
New dataframe instance containing the learning collection from a stream.
[in] | is | input stream |
[in] | p | additional, optional, parameters (see params structure) |
Definition at line 194 of file dataframe.cc.
explicit vita::dataframe::dataframe | ( | const std::filesystem::path & | ) |
Definition at line 217 of file dataframe.cc.
vita::dataframe::dataframe | ( | const std::filesystem::path & | fn, |
const params & | p | ||
) |
New dataframe instance containing the learning collection from a file.
[in] | fn | name of the file containing the learning collection (CSV / XRFF format) |
[in] | p | additional, optional, parameters (see params structure) |
Definition at line 210 of file dataframe.cc.
dataframe::iterator vita::dataframe::begin | ( | ) |
Definition at line 235 of file dataframe.cc.
dataframe::const_iterator vita::dataframe::begin | ( | ) | const |
Definition at line 243 of file dataframe.cc.
std::string vita::dataframe::class_name | ( | class_t | i | ) | const |
[in] | i | the encoded (dataframe::encode()) value of a class |
Returns the name of the class encoded as i (or an empty string if no such class can be found).
Definition at line 423 of file dataframe.cc.
class_t vita::dataframe::classes | ( | ) | const |
Returns the number of classes of the problem (== 0 for a symbolic regression problem, > 1 for a classification problem).
Definition at line 308 of file dataframe.cc.
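A caller can use the value returned by classes() to tell the two kinds of problem apart. The sketch below is only an illustration: problem_kind and kind_of are hypothetical names, class_t is assumed to be an unsigned integer type, and the value 1 is mapped to a separate "unspecified" case because this page does not say what it means.

```cpp
#include <cassert>

// Hypothetical classification of the value returned by classes().
enum class problem_kind { symbolic_regression, classification, unspecified };

// n_classes stands for the result of classes(); class_t is assumed to be
// an unsigned integer type.
constexpr problem_kind kind_of(unsigned n_classes)
{
  if (n_classes == 0)
    return problem_kind::symbolic_regression;
  if (n_classes > 1)
    return problem_kind::classification;

  return problem_kind::unspecified;  // 1 is not covered by the documentation
}
```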
void vita::dataframe::clear | ( | ) |
Removes all elements from the container.
Invalidates any references, pointers or iterators referring to contained examples. Any past-the-end iterators are also invalidated.
Leaves the associated metadata unchanged.
Definition at line 227 of file dataframe.cc.
bool vita::dataframe::empty | ( | ) | const |
Returns true if the dataframe is empty.
Definition at line 299 of file dataframe.cc.
dataframe::iterator vita::dataframe::end | ( | ) |
Definition at line 251 of file dataframe.cc.
dataframe::const_iterator vita::dataframe::end | ( | ) | const |
Definition at line 259 of file dataframe.cc.
dataframe::iterator vita::dataframe::erase | ( | iterator | first, |
iterator | last | ||
) |
Removes specified elements from the dataframe.
[in] | first | first element of the range |
[in] | last | end of the range |
Definition at line 772 of file dataframe.cc.
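Since iterator is examples_t::iterator and examples_t is a std::vector, the range form of erase combines naturally with standard algorithms. The sketch below shows the erase-remove idiom on a stand-in container: example_stub and drop_negative are hypothetical, and the output field is an assumption, because the members of the real example struct are not listed on this page. Whether dataframe::erase does anything beyond what std::vector::erase does is likewise not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for vita::dataframe::example (the real members are not shown).
struct example_stub
{
  double output;
};

// Stand-in container exposing the same begin / end / erase(first, last).
using examples_t = std::vector<example_stub>;

// Removes every example with a negative output; returns how many were removed.
std::size_t drop_negative(examples_t &d)
{
  const auto old_size = d.size();

  // remove_if moves the kept elements to the front and returns the new
  // logical end; erase(first, last) then drops the leftover tail.
  d.erase(std::remove_if(d.begin(), d.end(),
                         [](const example_stub &e) { return e.output < 0.0; }),
          d.end());

  assert(d.size() <= old_size);
  return old_size - d.size();
}
```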
dataframe::value_type & vita::dataframe::front | ( | ) |
Returns a reference to the first element in the dataframe.
Calling front on an empty dataframe is undefined.
Definition at line 283 of file dataframe.cc.
dataframe::value_type vita::dataframe::front | ( | ) | const |
Returns a constant reference to the first element in the dataframe.
Calling front on an empty dataframe is undefined.
Definition at line 271 of file dataframe.cc.
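Because front has undefined behaviour on an empty container, callers normally check empty() first. A minimal sketch of such a guard follows: first_or_nothing is a hypothetical helper, written as a template so that it works with any container offering empty(), front() and value_type. It returns a copy, so it does not depend on whether the const overload of front returns a value or a reference. It requires C++17 for std::optional.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical guard: returns a copy of the first element, or an empty
// optional when the container has no elements.
template<class Container>
std::optional<typename Container::value_type>
first_or_nothing(const Container &c)
{
  if (c.empty())
    return std::nullopt;  // front() must not be called here

  return c.front();
}
```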
bool vita::dataframe::is_valid | ( | ) | const |
Returns true if the object passes the internal consistency check.
Definition at line 780 of file dataframe.cc.
bool vita::dataframe::operator! | ( | ) | const |
Returns true if the current dataset is empty.
Definition at line 760 of file dataframe.cc.
void vita::dataframe::push_back | ( | const example & | e | ) |
Appends the given element to the end of the active dataset.
[in] | e | the value of the element to append |
Definition at line 332 of file dataframe.cc.
std::size_t vita::dataframe::read | ( | const std::filesystem::path & | fn | ) |
Definition at line 752 of file dataframe.cc.
std::size_t vita::dataframe::read | ( | const std::filesystem::path & | fn, |
const params & | p | ||
) |
Loads the content of a file into the active dataset.
[in] | fn | name of the file containing the data set (CSV / XRFF format) |
[in] | p | additional, optional, parameters (see params structure) |
std::invalid_argument | missing dataset file name |
Definition at line 742 of file dataframe.cc.
std::size_t vita::dataframe::read_csv | ( | std::istream & | from | ) |
Definition at line 726 of file dataframe.cc.
std::size_t vita::dataframe::read_csv | ( | std::istream & | from, |
params | p | ||
) |
Loads a CSV file into the active dataset.
[in] | from | the csv stream |
[in] | p | additional, optional, parameters (see params structure) |
exception::insufficient_data | empty / undersized data file |
General conventions:
\n to be part of a csv field if the field is surrounded by quotes;
Definition at line 677 of file dataframe.cc.
std::size_t vita::dataframe::read_xrff | ( | std::istream & | in | ) |
Definition at line 475 of file dataframe.cc.
std::size_t vita::dataframe::read_xrff | ( | std::istream & | in, |
const params & | p | ||
) |
Loads an XRFF file from a stream into the dataframe.
[in] | in | the xrff stream |
[in] | p | additional, optional, parameters (see params structure) |
Returns 0 in case of errors.
exception::data_format | wrong data format for data file |
See dataframe::read_xrff(tinyxml2::XMLDocument &) for details.
Definition at line 464 of file dataframe.cc.
std::size_t vita::dataframe::size | ( | ) | const |
Definition at line 291 of file dataframe.cc.
unsigned vita::dataframe::variables | ( | ) | const |
variables() + 1 == columns.size().
Definition at line 319 of file dataframe.cc.
columns_info vita::dataframe::columns |
Definition at line 157 of file dataframe.h.