Simple random samples and stratified random samples differ in how the sample is drawn from the overall population of data. Simple random samples involve the random selection of data from the entire population so each possible sample is equally likely to occur. In contrast, stratified random sampling divides the population into smaller groups, or strata, based on shared characteristics. A random sample is taken from each stratum in direct proportion to the size of the stratum compared to the population. The sample subsets are then combined to create a random sample.
Simple random sampling and stratified sampling are both types of probability sampling where each sample has a known probability of being selected. This is different from judgemental sampling, where the units to be sampled are hand-picked by the researcher.
The population is the total set of observations or data. A sample is a set of observations from the population. The sampling method is the process used to pull samples from the population. A simple random sample is a random sample pulled from the entire population with no constraints placed on how the sample is pulled. This method has no bias in selecting the sample from the population, so each population element has an equal chance of being included in the sample.
Use of stratified random sampling in machine learning:
Stratified sampling aims at splitting one data set so that each split are similar with respect to something.
In a classification setting, it is often chosen to ensure that the train and test sets have approximately the same percentage of samples of each target class as the complete set.
As a result, if the data set has a large amount of each class, stratified sampling is pretty much the same as random sampling. But if one class isn’t much represented in the data set, which may be the case in your dataset since you plan to oversample the minority class, then stratified sampling may yield a different target class distribution in the train and test sets than what random sampling may yield.
Note that the stratified sampling may also be designed to equally distribute some features the next train and test set. For example, if each sample represent one individual, and one feature is age, it is sometimes useful to have the same age distribution in both the train and test set.
For reference: https://en.wikipedia.org/wiki/Stratified_sampling