I've noticed that the PProxy takes a while to refresh the stats, longer than 30 minutes. The blocks going to D.Net aren't delayed, are they?
I'm pretty sure they aren't being lost at all. As long as you can still send/receive to the pproxy, the stats will mind their own business.
I too have noticed the stats runs have been taking longer -> 10-15 minutes. Now, it appears they aren't running at all - we're still stuck at 19:00 GMT. Hopefully, we'll get an update soon from Virus as to what's going on.
The stats script now takes 15-16 mins to finish. Without the by host page, it would only take 12 mins. Other than that, most of the time is spent collecting data from the 70+ MB of log files, which takes about 9 mins.
I am working on porting the Perl script over to PHP and then adding MySQL database support, so it will only have to collect the changes from the last 30 mins instead of the last 3 months.
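To make the "only collect the changes" idea concrete, here is a minimal sketch in Python (not the Perl or PHP script discussed in this thread). It remembers how far into the log the last run got and only reads what was appended since. The log format (one `user blocks` pair per line), the JSON state file, and the name `update_totals` are all assumptions made for illustration, not the real pproxy log layout.

```python
import json
import os
from collections import Counter

def update_totals(log_path, state_path):
    """Add blocks from log lines appended since the last run to saved totals."""
    state = {"offset": 0, "totals": {}}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)

    # If the log shrank (rotated or truncated), read the new file from the top.
    if os.path.getsize(log_path) < state["offset"]:
        state["offset"] = 0

    totals = Counter(state["totals"])
    with open(log_path, "rb") as log:
        log.seek(state["offset"])
        while True:
            line = log.readline()
            if not line.endswith(b"\n"):
                break  # end of file or a half-written line; pick it up next run
            state["offset"] = log.tell()
            # Hypothetical format: "<user> <blocks>"; other lines are skipped.
            fields = line.decode("utf-8", "replace").split()
            if len(fields) == 2 and fields[1].isdigit():
                totals[fields[0]] += int(fields[1])

    state["totals"] = dict(totals)
    with open(state_path, "w") as f:
        json.dump(state, f)
    return state["totals"]
```

With something like this, each half-hourly run touches only the new tail of the log, so the run time should follow the last 30 mins of traffic rather than the full 3 months of history.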
Apache went ballistic at around 19:30 GMT. It does this at least once a day, and it is getting worse the bigger the log files get.
What I mean by ballistic is that it starts using as much CPU as it can. And since Apache has a higher priority than the stats script, the scripts never finish, so they just pile up. The more scripts that are running, the slower the whole system gets.
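One common way to stop runs from piling up is to have each run take a lock and simply skip itself if the previous one is still going. The sketch below is in Python and is only an illustration of that technique, not what the server actually does; `run_exclusively` and the lock file path are hypothetical names.

```python
import os

def run_exclusively(lock_path, job):
    """Run job() unless another run holds the lock. Returns True if it ran."""
    try:
        # O_EXCL makes the create fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # previous run is still going; skip this one
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))  # record the owner for debugging
        job()
    finally:
        os.remove(lock_path)  # release the lock even if job() raised
    return True
```

A run that is killed outright (or a reboot) can leave a stale lock file behind, so a real setup would also want to clear the lock at boot or check whether the recorded PID is still alive.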
I have rebooted the server and the 0:00 stats script has started, so you should have new stats at 0:15 GMT.
It seems this is spiraling, making it harder and harder to have quickly updated stats while keeping the server easily maintainable.
Any ideas how you will deal with this problem in the future, Virus? I'm hoping to start dumping 10-20k blocks/day through it (if it can handle the extra load), and I'm sure it's only going to grow and grow. I see us regularly putting 200-300 MKey/s worth of blocks a day through it sometime in the not-too terribly distant future. Will this bring the server to its knees?
Well, going the way we are, in another month I will have to update the stats every hour (just like anandtech's proxy had to do).
I am going to put a higher priority on porting the stats script to PHP. Then it would take less than a minute to update. The only problem is a large database (up to 5000 entries per user).
How would it work if you had 2 local servers...
One serving as a proxy server and the other as a stats server? I know we discussed this earlier about using your machine as well as mine, but as we now know, it wouldn't work because we'd have to transfer a 70MB log file, which isn't possible.
However, if you had 2 machines on a 100Mbit network, I think it would be much faster. All you would need for the proxy would be a P166-233. Then set up the stats server, which reads the log file from the other machine. What do you think??
BTW, an Ultra SCSI 2 hard drive might help also...
Let me know what you think, and how it would work!
I am already ahead of you on that. I just got my internal 100BaseT network up last week. Presently it runs between my K6-2 300 and my Cel366 @413-550. I am looking into getting a dual Celeron or a K7 to replace the K6-2 or add to the mixture.
The best solution is to run only the script on the second machine, since the script is probably what is causing Apache (the web server) to lose it.
The PHP script with the database is not that far off. It also allows for customization and an integrated CPU archive for each user. Plus whatever else my diabolical mind creates.
I know this idea will probably get a lot of negative feedback, but there is a lot of stat stuff that probably no one even goes and sees.
You said you already got rid of "by host". I think the "by full detail" and "by domain" and "hour, month, week, year" could go bye bye too.
Those stat pages probably get little traffic, but maybe I'm wrong.
But it's a thought that maybe we could vote on or discuss more.
The by host page takes around three minutes to create. That is because it has to search for new IP addresses and then resolve each one to a host name. I will only run it maybe once a week. It is very useful for tracking email@example.com users. All the other pages take less than 5 secs, so removing them won't make a difference.
I'll see if I can strip some more useless or inefficient coding out.
I feel that the by host is useful in that it allows you to keep track of various machines that are working for you and how much they are producing.
It would seem that a faster machine would speed things up. That new Abit dual 370 M/board looks good at an economical price. My advice is for everyone that wants dual to get one while you can, because I heard that Intel intends to remove the pin that allows the Celeron 370 to work in dual mode.
Did you say that the by host is only updated weekly?
[This message has been edited by cwizard (edited 08-18-99).]
I was going to just run the whole by host section only once a week, but instead I am going to keep the by host updated every half hour and run the resolve host (the hog) section once a week. So any new members or new connections will only show up on the by host stats once a week.
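The split described here (cheap page update every half hour, slow host resolution once a week) can be pictured with a small cache of already-resolved addresses. This is a rough Python sketch under assumptions, not the actual script: the JSON cache file and the names `reverse_dns` and `resolve_new_hosts` are made up for illustration, while `socket.gethostbyaddr` is the standard library's reverse lookup call.

```python
import json
import os
import socket

def reverse_dns(ip):
    """Return the host name for ip, or the ip itself if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return ip

def resolve_new_hosts(ips, cache_path, resolver=reverse_dns):
    """Weekly pass: look up only addresses that are not in the cache yet."""
    cache = {}
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            cache = json.load(f)
    for ip in ips:
        if ip not in cache:
            cache[ip] = resolver(ip)  # the slow part, done once per address
    with open(cache_path, "w") as f:
        json.dump(cache, f)
    return cache
```

The half-hourly by host page would then only read the cache and show the bare IP for anything not resolved yet, which matches new connections getting a host name only after the weekly run.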
I can now get the new BP6 from my supplier. I will see how much cash flow I have left after the move (if I move). Don't worry about the pproxy. It will only be down for an hour at most. That's if I move.