Project talk:OpenMLDatamodels
Initial feedback on the data model for OpenML dataset items
The following remarks are based on this version of the documentation page and this version of the sample item.
On the documentation page, each statement type should link to the respective property. For a generally useful approach to sharing Wikibase data models, see the corresponding pages on some WikiProjects over on Wikidata, e.g. here. It is also advisable to create and document the necessary properties in advance in order to facilitate their discussion.
Some specific points regarding individual properties:
- dataset version
- The property description page is essentially empty. Should this be specific to OpenML or generic?
- Answer: We can keep it generic in my opinion
- Tim requested a name change to "dataset version identifier"
- Some of the other properties may change with the version number (certainly the checksum, for instance). How should that be handled?
- Answer: My plan was to always update it to the newest version, including all properties
- author name string
- Keep track of the order in the author list, as per series ordinal, so as to facilitate conversion to author statements.
- Answer: Even though there is no meaningful order in OpenML, afaik?
- Yes.
- default target attribute
- The property description page is essentially empty and needs to be fleshed out.
- Answer: Sure :) (this also applies to all other property pages)
- checksum
- Depends on the version, so it should be coordinated with that (see above).
- Answer: Also see above, this depends on how we handle the version
- has feature
- The property description page is essentially empty and needs to be fleshed out.
- This property will be removed.
- number of binary features
- The property description page is essentially empty and needs to be fleshed out.
- number of classes
- The property description page is essentially empty and needs to be fleshed out.
- number of features
- The property description page is essentially empty and needs to be fleshed out.
- number of instances
- The property description page is essentially empty and needs to be fleshed out.
- number of instances with missing values
- The property description page is essentially empty and needs to be fleshed out.
- number of missing values
- The property description page is essentially empty and needs to be fleshed out.
- number of numeric features
- The property description page is essentially empty and needs to be fleshed out.
- number of symbolic features
- The property description page is essentially empty and needs to be fleshed out.
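To illustrate the series ordinal suggestion under "author name string" above, here is a minimal sketch of how the order of an author list could be recorded alongside each name. It uses plain Python dictionaries rather than any Wikibase client library; the function names, the dictionary layout, and the choice to store the ordinal as a string are assumptions made for illustration, not the portal's actual import code.

```python
from typing import Dict, List


def author_statements(names: List[str]) -> List[Dict[str, object]]:
    """Pair each author name string with a 1-based series ordinal qualifier.

    The dictionary layout is hypothetical; a real import would map the
    property labels to the portal's own property identifiers.
    """
    return [
        {
            "property": "author name string",
            "value": name,
            # Assumption: the ordinal is kept as a string value.
            "qualifiers": {"series ordinal": str(position)},
        }
        for position, name in enumerate(names, start=1)
    ]


def names_in_order(statements: List[Dict[str, object]]) -> List[str]:
    """Recover the original author order from the series ordinal qualifiers."""
    ordered = sorted(
        statements,
        key=lambda s: int(s["qualifiers"]["series ordinal"]),
    )
    return [s["value"] for s in ordered]
```

Even if the order carries no meaning in OpenML, keeping the ordinal means the list can be reconstructed as given, which is what a later conversion to author statements would rely on.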