Random Sampling: Example
Let's consider a classic numerical problem that can be solved using a randomized algorithm: Random Sampling for Estimating the Mean.
Problem:
You have a large dataset and want to estimate the mean of the data. Instead of computing the mean of the entire dataset, you can use a randomized sampling approach to estimate it. This method involves selecting a random sample from the dataset and computing the mean of that sample.
Steps:
- Randomly Sample: Select a random subset (sample) of data points from the dataset.
- Compute Mean: Calculate the mean of the sampled data points.
- Estimate Mean: Use the mean of the sample as an estimate for the mean of the entire dataset.
Why This Works:
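The mean of a uniformly random sample is an unbiased estimator of the dataset's true mean: averaged over many repetitions, it neither systematically overshoots nor undershoots. Its typical error shrinks roughly in proportion to 1/sqrt(n) as the sample size n grows, so larger samples give tighter estimates. Because every data point is equally likely to be chosen, no part of the dataset is systematically over- or under-represented in the sample. The method is simple, and it is useful for large datasets where computing the exact mean would be computationally expensive.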
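To make the effect of sample size concrete, here is a minimal sketch that repeats the sampling many times and measures how much the estimates vary. The dataset is synthetic, and the function name spread_of_estimates, the trial count, and the seeds are assumptions chosen only for illustration.

```python
import random
import statistics


def spread_of_estimates(data, sample_size, trials=500, seed=0):
    """Return the standard deviation of sample means over repeated trials."""
    rng = random.Random(seed)  # seeded so repeated runs give the same numbers
    means = [
        statistics.fmean(rng.sample(data, sample_size))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)


# Hypothetical dataset: 10,000 values drawn uniformly from [0, 100)
rng = random.Random(42)
data = [rng.uniform(0, 100) for _ in range(10_000)]

# Larger samples should give estimates that vary less from trial to trial
for n in (10, 100, 1000):
    spread = spread_of_estimates(data, n)
    print(f"sample_size={n:5d}  spread of estimates={spread:.3f}")
```

The printed spread should drop as the sample size grows, which is the practical meaning of "large enough": the sample size is chosen to match the error you are willing to accept.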
Python Implementation: Random Sampling
import random


def estimate_mean(data, sample_size):
    """
    Estimate the mean of the dataset using random sampling.

    Parameters:
        data (list): The full dataset.
        sample_size (int): The number of elements to sample (without
            replacement); must be between 1 and len(data).

    Returns:
        float: The estimated mean of the dataset.
    """
    # Randomly sample the data without replacement
    sample = random.sample(data, sample_size)
    # Compute the mean of the sample
    sample_mean = sum(sample) / len(sample)
    return sample_mean


# Example usage
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
        12, 23, 28, 55, 60, 78, 56, 49, 69, 29, 35]
sample_size = 5
estimated_mean = estimate_mean(data, sample_size)
actual_mean = sum(data) / len(data)
print(f"Estimated Mean of the dataset: {estimated_mean:.2f}")
print(f"Actual Mean of the dataset: {actual_mean:.2f}")
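A single estimate says nothing about how far off it might be. As a rough sketch, the variant below also reports the standard error of the sample mean. The function name estimate_mean_with_error and the seed parameter are hypothetical additions, not part of the implementation above. The 1.96 multiplier assumes an approximately normal sampling distribution, which is a crude assumption for a sample of only 5 points, and the sketch ignores the finite-population correction.

```python
import math
import random
import statistics


def estimate_mean_with_error(data, sample_size, seed=None):
    """Return (sample mean, standard error) from one random sample.

    sample_size must be at least 2 so a sample standard deviation exists.
    """
    rng = random.Random(seed)  # pass a seed for reproducible results
    sample = rng.sample(data, sample_size)
    mean = statistics.fmean(sample)
    # Standard error: sample standard deviation divided by sqrt(n)
    std_error = statistics.stdev(sample) / math.sqrt(sample_size)
    return mean, std_error


data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100,
        12, 23, 28, 55, 60, 78, 56, 49, 69, 29, 35]
mean, std_error = estimate_mean_with_error(data, sample_size=5, seed=1)
# Approximate 95% interval under the normality assumption stated above
print(f"Estimated mean: {mean:.2f} +/- {1.96 * std_error:.2f}")
```

With a sample this small the interval will be wide. Increasing sample_size narrows it, at the cost of reading more of the data.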