时间内近似得出答案,是您希望测试的样本行数。为此:O(NlogN+kN)k
- 选择一组均匀的点跨越范围 [0,π)。这些是您的示例行。kθ
- 的行开始循环。( )
θ=0O(N)
- 的方向上得到一个单位向量。( )uθO(N)
- 得到一个向量垂直于这个向量。( )u^O(N)
- 将所有点投影到上。( )u^O(N)
- 沿对点进行排序。( )u^O(N)
- 沿着已排序的点滑动一个与您的页边距大小相同的窗口,并注意最大数量的点何时在此窗口内以及发生这种情况时窗口的中点。( )O(N)
- 返回有关具有最大点数的行的信息。
该算法需要时间来对点进行初始排序。然而,在此之后,我们将这些点投影到一条缓慢旋转的线上。缓慢的旋转意味着每次重投影都会导致点已经非常接近它们之前排序的位置:O(NlogN)

大多数排序算法能够在几乎线性的时间内处理这种情况。还要注意,随着样本数量的增加,点已经排序的程度也会增加。
完整的算法如下所示:
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <iostream>
#include <random>
#include <vector>
class FracPoint {
public:
float frac = std::numeric_limits<float>::quiet_NaN();
float x;
float y;
FracPoint(float x0, float y0) : x(x0), y(y0) {}
bool operator<(const FracPoint &o) const {
return frac<o.frac;
}
bool operator>(const FracPoint &o) const {
return frac>o.frac;
}
};
struct BestLine {
int count;
double slope;
double intercept;
};
typedef std::vector<FracPoint> Points;
typedef std::pair<int,double> CountIntercept;
///Projects points to their closest orthogonal point on a line at an angle theta
///to the x-axis. Sets each point's `frac` property to the number of unit
///vectors their projection is from the origin. This can be used to sort points
///along the length of the line.
void GetFractionalDistPointsProjectedToVector(
Points &pts,
const double theta
){
const double tvx = std::cos(theta);
const double tvy = std::sin(theta);
//Get a line perpendicular to the one we just generated. This is the line we
//sort the points along to find the maximal margin.
const auto vx = -tvy;
const auto vy = tvx;
//Do a dot product (v dot s)/(s dot s) and multiply by s to get projection
for(auto &pt: pts)
pt.frac = (pt.x*vx+pt.y*vy)/(vx*vx+vy*vy);
}
CountIntercept FindBestIntercept(Points &pts, const double margin){
Points::iterator upper=pts.begin();
Points::iterator lower=pts.begin();
int best_count = 0;
double best_pos = 0;
for(;upper!=pts.end();++upper){
while(upper->frac-lower->frac>margin)
++lower;
const auto count = upper-lower+1;
if(count>best_count){
best_count = count;
best_pos = (upper->frac-lower->frac)/2;
}
}
return CountIntercept(best_count,best_pos);
}
BestLine FindBestLine(Points &pts, const int samples, const double margin){
int best_count = 0;
double best_theta;
for(int i=0;i<samples;i++){
const double theta = i*(M_PI/samples);
//Project points onto the line defined by theta
GetFractionalDistPointsProjectedToVector(pts, theta);
//Since points are already mostly sorted from the last projection, use
//insertion sort's O(N) best case scenario to make them totally sorted.
// ska_sort(pts.begin(),pts.end(),[](const FracPoint &pt){return pt.frac;});
std::sort(pts.begin(),pts.end());
const auto count_intercept = FindBestIntercept(pts, margin);
if(count_intercept.first>best_count){
best_count = count_intercept.first;
best_theta = theta;
}
}
//Project points onto the line defined by the best theta
GetFractionalDistPointsProjectedToVector(pts, best_theta);
//Since points are being projected onto a "randomly" chosen line, we use an
//O(N log N) sort
std::sort(pts.begin(),pts.end());
const auto count_intercept = FindBestIntercept(pts, margin);
//`count_intercept.second` isn't actually the intercept, it's the distance
//along a line perpendicular to the one defined theta at which the best
//intercept is located. Let's find the real intercept now.
double intercept;
{
//Get vector along theta
const double tvx = std::cos(best_theta);
const double tvy = std::sin(best_theta);
//Get perpendicular vector
auto vx = -tvy;
auto vy = tvx;
//Multiply by `counter_intercept.second` to get absolute position of best
//point.
vx *= count_intercept.second;
vy *= count_intercept.second;
//Get slope of line
double slope = std::tan(best_theta);
//Use point-slope equation of line y-y1=m*(x-x1) and reformulate to y=m*x+b
//to recover the actual intercept
intercept = -slope*vx+vy;
}
return {best_count, std::tan(best_theta), intercept};
}
int main(int argc, char **argv){
if(argc!=3){
std::cout<<"Syntax: "<<argv[0]<<" <N> <Samples>"<<std::endl;
return -1;
}
const int N = std::stoi(std::string(argv[1]));
const int samples = std::stoi(std::string(argv[2]));
//Poor way of initializing PRNG. Fine for this application.
std::mt19937 gen(1234567);
std::uniform_real_distribution<> dis(-100, 100);
auto dice = std::bind(dis,gen);
Points pts;
for(int i=0;i<N;i++)
pts.emplace_back(dice(),dice());
const auto ret = FindBestLine(pts, samples, 25);
std::cout<<"Count = "<<ret.count<<std::endl;
std::cout<<"Best slope = "<<ret.slope<<std::endl;
std::cout<<"Best intercept = "<<ret.intercept<<std::endl;
return 0;
}
现在,我们为算法在不同数量的点上计时 100 个样本,结果如下:
1000, 0.014
10000, 0.083
50000, 0.445
100000, 0.943
500000, 5.615
1000000, 12.092
5000000, 68.748
10000000, 144.155
实际上,它几乎是线性的(在 R2=0.9962 的对数图上的斜率为 1.028)。
OpenMP 可用作并行化操作的简单方法:
#pragma omp parallel for firstprivate(pts)
for(int i=0;i<samples;i++){
const double theta = i*(M_PI/samples);
//Project points onto the line defined by theta
GetFractionalDistPointsProjectedToVector(pts, theta);
//Since points are already mostly sorted from the last projection, use
//insertion sort's O(N) best case scenario to make them totally sorted.
// ska_sort(pts.begin(),pts.end(),[](const FracPoint &pt){return pt.frac;});
std::sort(pts.begin(),pts.end());
const auto count_intercept = FindBestIntercept(pts, margin);
#pragma omp critical
if(count_intercept.first>best_count){
best_count = count_intercept.first;
best_theta = theta;
}
}