Data mining - CDF with massive data

I am trying to plot the CDF of 7 files that I have. Each file looks like this:

and so on. The problem is the size of the files, which are:

862M
1,8G
2,4G
18G
2,0G
1,8G

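I have put together a simple R script that just loads the files and plots them. When I generate smaller fake files, the script works fine. With the full files, however, it has been running for three days without producing any output.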

The script is as follows:

library(ggplot2)    
data <- read.table('file1.csv')
data$g = "G1"
data2 <- read.table('file2.csv')
data2$g = "G2"
data3 <- read.table('file3.csv')
data3$g = "G3"
data4 <- read.table('file4.csv')
data4$g = "G4"
data5 <- read.table('file5.csv')
data5$g = "G5"
data6 <- read.table('file6.csv')
data6$g = "G6"
data7 <- read.table('file7.csv')
data7$g = "G7"

dftotal = rbind(data,data2)
dftotal = rbind(dftotal,data3)
dftotal = rbind(dftotal,data4)
dftotal = rbind(dftotal,data5)
dftotal = rbind(dftotal,data6)
dftotal = rbind(dftotal,data7)

gp <- ggplot(data = dftotal, aes(x = V1, group = factor(g))) + stat_ecdf()
ggsave('cdf.eps',gp)

Does anyone know a more efficient way to do this?
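One possible direction, offered as a rough sketch rather than a tested fix for these exact files: the script above keeps every row in memory, copies the data on each `rbind`, and asks `stat_ecdf()` to draw a step for every distinct value, while a plotted CDF only needs on the order of a thousand points per curve. The sketch below streams each file in chunks, counts values into fixed bins, and plots the cumulative counts. It assumes each file holds one numeric value per line with no header (the sample rows are not shown in the post, so this is a guess). The helper names `file_range` and `binned_cdf`, the chunk size, and the 1000-bin grid are all illustrative choices, and the two small temporary files only stand in for the real data.

```r
# Pass 1: find min and max of a file, reading chunk_size values at a time.
# Assumes one numeric value per line and no header (an assumption).
file_range <- function(path, chunk_size = 1e6) {
  rng <- c(Inf, -Inf)
  con <- file(path, open = "r")
  on.exit(close(con))
  repeat {
    x <- scan(con, what = numeric(), nmax = chunk_size, quiet = TRUE)
    if (length(x) == 0) break
    rng <- range(rng, x)
  }
  rng
}

# Pass 2: count values per bin, chunk by chunk, then accumulate into a CDF.
# Bins are (b[i], b[i+1]], so when breaks cover the data the result equals
# the empirical CDF evaluated at each right bin edge.
binned_cdf <- function(path, breaks, chunk_size = 1e6) {
  counts <- numeric(length(breaks) - 1)
  con <- file(path, open = "r")
  on.exit(close(con))
  repeat {
    x <- scan(con, what = numeric(), nmax = chunk_size, quiet = TRUE)
    if (length(x) == 0) break
    idx <- findInterval(x, breaks, left.open = TRUE, rightmost.closed = TRUE)
    counts <- counts + tabulate(idx, nbins = length(counts))
  }
  data.frame(x = breaks[-1], cdf = cumsum(counts) / sum(counts))
}

# Hypothetical demo data: two tiny files instead of the real multi-GB ones.
paths <- c(G1 = tempfile(fileext = ".csv"), G2 = tempfile(fileext = ".csv"))
writeLines(as.character(1:10), paths[["G1"]])
writeLines(as.character(c(5, 5, 5, 20)), paths[["G2"]])

# A shared grid over the combined range keeps the curves comparable.
ranges <- vapply(paths, file_range, numeric(2), chunk_size = 3)
breaks <- seq(min(ranges), max(ranges), length.out = 1001)

# One small data frame (1000 rows per file) replaces the huge dftotal.
curves <- do.call(rbind, lapply(names(paths), function(g) {
  cbind(binned_cdf(paths[[g]], breaks, chunk_size = 3), g = g)
}))

# Plot only if ggplot2 is available; the reduced data is cheap to draw.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  gp <- ggplot(curves, aes(x = x, y = cdf, colour = g)) + geom_step()
  # ggsave('cdf.eps', gp)
}
```

For the real data, `paths` would point at the seven files and `chunk_size` could go back to its default. The cost is two sequential passes per file, with memory bounded by the chunk size rather than the file size; the curve is exact at the grid points and only loses detail between them, which a finer `breaks` grid would recover.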