How do you design and test for large volumes of concurrent data in Rails?


Greetings Stackers.

We're working on a project which stores second-by-second tracking data for participants in psych experiments. Our current design has a Flash client which collects 60 seconds' worth of timestamp/activity pairings and then posts the data as strings, along with a little participant metadata, to our Rails (3.0.3) / MySQL (5.1) application. Edit: We're using vanilla Passenger/Nginx for the front end. Rails splits the timestamp/activity strings into parallel arrays, generates a single raw SQL insert statement, and then shoves everything into one massive table, i.e. (simplified code):

# Split the comma-separated strings posted by the Flash client into parallel arrays.
@feedback_data  = params[:feedbackValues].split(",")
@feedback_times = params[:feedbackTimes].split(",")

# Build a single multi-row INSERT rather than one query per reading.
inserts = []
base = "(" + @userid + "," + @studyid + ","
@feedback_data.each_with_index do |e, i|
  # Quote the timestamp and activity values so the generated SQL is valid.
  record  = base + "'" + @feedback_times[i].to_s + "',"
  record += "'" + @feedback_data[i].to_s + "')"
  inserts.push(record)
end
sql = "INSERT INTO excitement_datas (participantId, studyId, timestamp, activityLevel) VALUES #{inserts.join(", ")}"
ActiveRecord::Base.connection.execute sql

Yields:

INSERT INTO excitement_datas (participantId, studyId, timestamp, activityLevel)
VALUES (3,5,'2011-01-27 05:02:21',47),(3,5,'2011-01-27 05:02:22',56), etc.

The design has generated a lot of debate on the team. Studies will have tens or hundreds of concurrent participants. I've staggered the 60-second POST interval for each client so that incoming data is distributed more evenly, but I'm still getting lots of doom-and-gloom predictions.

What else can we do / should we do to improve the scalability of this design in Rails?

What tools / techniques can I use to accurately predict how this performs under load?

Many thanks.


Comments (1)

半边脸i 2024-10-22 02:13:41


This is more of an architecture issue than a code issue. Your code looks sane, and generating only one SQL query is a good approach. But what's your application server?

If you are using, say, a single Thin server, then requests will block while the database is executing the SQL query, leading to an unresponsive app.

Using Passenger or Unicorn you'd get an increase in concurrency, but each request would still be held up by a fairly slow SQL query.
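
For example, a minimal Unicorn configuration might look like the sketch below; the worker count and socket path are placeholders, not details from the original post:

# config/unicorn.rb -- a minimal sketch, not a tuned configuration
worker_processes 4                       # a slow INSERT only ties up one of the four workers
listen "/tmp/unicorn.sock", backlog: 64  # Nginx proxies to this socket
timeout 30                               # kill workers stuck on a runaway request
preload_app true                         # load Rails once, then fork workers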

If you're really worried about that query, you could try an intermediate Memcache or RabbitMQ layer that stores a job for each received request. Then have a background task (or many of them) pick the jobs up and do the slow insert. Memcache and Rabbit respond much faster than MySQL, and you'd only be handing them the raw request.
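
As a rough sketch of that queue-based approach using the Bunny gem (the queue name, payload format, and local RabbitMQ are assumptions, not details from the post), the controller only publishes the raw parameters and returns, while a separate worker process drains the queue and performs the bulk insert:

require "bunny"
require "json"

# In the controller: publish the raw payload and return immediately.
# (In a real app you'd open the Bunny connection once, not per request.)
conn  = Bunny.new.start
queue = conn.create_channel.queue("feedback_inserts", durable: true)
queue.publish({ userid:  params[:userid],
                studyid: params[:studyid],
                times:   params[:feedbackTimes],
                values:  params[:feedbackValues] }.to_json,
              persistent: true)

# In a separate worker process: consume jobs and run the slow INSERT.
worker = Bunny.new.start.create_channel.queue("feedback_inserts", durable: true)
worker.subscribe(block: true) do |_delivery, _properties, body|
  payload = JSON.parse(body)
  # ...build and execute the multi-row INSERT exactly as in the question...
end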

This means the request completes very quickly and hands the heavy lifting off to your worker tasks. Delayed Job could be something to look at, or Workling, or Bunny/EventMachine for Rabbit.
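
With Delayed Job, for instance, the same idea might look roughly like this (a sketch; the BulkInsertJob class is illustrative, not part of the original code). The controller writes one small job row instantly, and a `rake jobs:work` process performs the heavy insert later:

# A Delayed Job-style worker: the struct members carry the raw request data.
class BulkInsertJob < Struct.new(:userid, :studyid, :times_csv, :values_csv)
  def perform
    times  = times_csv.split(",")
    values = values_csv.split(",")
    rows = values.each_index.map { |i| "(#{userid},#{studyid},'#{times[i]}','#{values[i]}')" }
    ActiveRecord::Base.connection.execute(
      "INSERT INTO excitement_datas (participantId, studyId, timestamp, activityLevel) " \
      "VALUES #{rows.join(', ')}"
    )
  end
end

# In the controller: enqueue and return without touching the big table.
Delayed::Job.enqueue BulkInsertJob.new(params[:userid], params[:studyid],
                                       params[:feedbackTimes], params[:feedbackValues])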

Memcache persistence might be an issue for you, so I'd recommend Rabbit if you fancy the queue-based approach.

On top of that, you could look at Apache Bench to see how you're actually doing already:

http://httpd.apache.org/docs/2.0/programs/ab.html
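
For instance, a run replaying your 60-second POST might look like the invocation below (the URL, payload file, and request counts are placeholders):

# 1,000 POSTs, 50 concurrent, replaying a captured form-encoded payload
ab -n 1000 -c 50 -p feedback_payload.txt \
   -T "application/x-www-form-urlencoded" \
   http://localhost/feedback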
