dataset::source
-
class source : public queryosity::action
Custom dataset source.
Subclassed by queryosity::dataset::reader< DS >
Public Functions
-
virtual void parallelize(unsigned int concurrency) = 0
Inform the dataset of parallelism.
-
inline virtual void initialize()
Initialize dataset processing.
-
virtual std::vector<std::pair<unsigned long long, unsigned long long>> partition() = 0
Determine dataset partition for parallel processing.
A non-empty partition MUST begin from 0 and be in sorted, contiguous order, e.g.: {{0,100},{100,200}, ..., {900,1000}}
If a dataset returns an empty partition, it relinquishes its control over the entry loop to another dataset with a non-empty partition.
- Attention
Non-empty partitions reported from multiple datasets need to be aligned to form a common denominator partition over which the dataset processing is parallelized. As such, they MUST have (1) at minimum, the same total number of entries, and (2) ideally, shared sub-range boundaries.
Any dataset reporting an empty partition MUST be able to fulfill dataset::source::execute() calls for any entry number as requested by the other datasets loaded in the dataflow.
- Returns:
Dataset partition
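The partition rules above can be sketched with a few standalone helpers. This is only an illustration under stated assumptions: `entry_range`, `partition_t`, `make_partition`, `is_valid_partition` and `total_entries` are hypothetical names that are not part of queryosity, and each range is read as half-open, `[begin, end)`, which matches the `{{0,100},{100,200}, ...}` example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical aliases for readability; not queryosity types.
using entry_range = std::pair<unsigned long long, unsigned long long>;
using partition_t = std::vector<entry_range>;

// Split [0, total) into contiguous half-open ranges of at most `chunk` entries.
// A chunk size of 0 yields an empty partition.
partition_t make_partition(unsigned long long total, unsigned long long chunk) {
  partition_t parts;
  if (chunk == 0) return parts;
  unsigned long long begin = 0;
  while (begin < total) {
    // Clamp the last range to `total` without overflowing begin + chunk.
    const unsigned long long end =
        (total - begin < chunk) ? total : begin + chunk;
    parts.emplace_back(begin, end);
    begin = end;
  }
  return parts;
}

// Check the documented requirements: an empty partition is allowed; a
// non-empty one must start at 0 and be sorted and contiguous.
// Rejecting zero-length ranges is an extra assumption made here.
bool is_valid_partition(const partition_t& parts) {
  if (parts.empty()) return true;
  if (parts.front().first != 0) return false;
  for (std::size_t i = 0; i < parts.size(); ++i) {
    if (parts[i].first >= parts[i].second) return false;
    if (i > 0 && parts[i].first != parts[i - 1].second) return false;
  }
  return true;
}

// Total number of entries covered by a valid partition.
unsigned long long total_entries(const partition_t& parts) {
  return parts.empty() ? 0 : parts.back().second;
}
```

For instance, `make_partition(1000, 100)` produces the ten ranges shown in the example above. Two datasets whose partitions give the same `total_entries` value satisfy requirement (1) of the alignment note; shared sub-range boundaries, requirement (2), are not checked by this sketch.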
-
inline virtual void initialize(unsigned int slot, unsigned long long begin, unsigned long long end) override
Enter an entry loop.
- Parameters:
slot – [in] Thread slot number.
begin – [in] First entry number processed.
end – [in] Loop stops after the entry numbered end-1 has been processed (i.e. end is exclusive).
-
inline virtual void execute(unsigned int slot, unsigned long long entry) override
Process an entry.
- Parameters:
slot – [in] Thread slot number.
entry – [in] Entry being processed.
-
inline virtual void finalize(unsigned int slot) override
Exit an entry loop.
- Parameters:
slot – [in] Thread slot number.
-
inline virtual void finalize()
Finalize processing the dataset.
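To show how the member functions above might fit together, here is a minimal sequential sketch. It is an assumption-laden illustration rather than queryosity's actual driver: `counting_source` and `run_sequentially` are hypothetical, the struct only mirrors the documented member names instead of deriving from `queryosity::dataset::source`, the call order simply follows the order in which the members are documented, and the round-robin assignment of ranges to slots is invented for the example.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in that mirrors the documented member names.
// It does NOT derive from queryosity::dataset::source.
struct counting_source {
  std::vector<unsigned long long> seen;  // entries executed, per slot

  void parallelize(unsigned int concurrency) { seen.assign(concurrency, 0); }
  void initialize() {}
  std::vector<std::pair<unsigned long long, unsigned long long>> partition() {
    return {{0, 5}, {5, 10}};  // starts at 0, sorted, contiguous
  }
  void initialize(unsigned int /*slot*/, unsigned long long /*begin*/,
                  unsigned long long /*end*/) {}
  void execute(unsigned int slot, unsigned long long /*entry*/) { ++seen[slot]; }
  void finalize(unsigned int /*slot*/) {}
  void finalize() {}
};

// Drive the source one range at a time (no threads) and return the
// total number of entries executed across all slots.
unsigned long long run_sequentially(counting_source& ds, unsigned int concurrency) {
  if (concurrency == 0) concurrency = 1;  // avoid dividing by zero below
  ds.parallelize(concurrency);
  ds.initialize();
  const auto parts = ds.partition();
  for (std::size_t i = 0; i < parts.size(); ++i) {
    // Assumed slot assignment: round-robin over the available slots.
    const unsigned int slot = static_cast<unsigned int>(i % concurrency);
    ds.initialize(slot, parts[i].first, parts[i].second);
    // `end` is exclusive: the last entry processed is end - 1.
    for (unsigned long long entry = parts[i].first; entry < parts[i].second; ++entry) {
      ds.execute(slot, entry);
    }
    ds.finalize(slot);
  }
  ds.finalize();
  unsigned long long total = 0;
  for (unsigned long long n : ds.seen) total += n;
  return total;
}
```

With two slots, each of the two ranges is handled by its own slot, so every slot executes five entries and ten entries are processed in total.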