Opened 16 years ago

Last modified 10 years ago

#3499 closed Bug report

BIG site: Slow listing, high CPU usage

Reported by: ellach
Owned by:
Priority: normal
Component: Other
Keywords:
Cc: ellach, Alexander Schuch, Tim Kosse, nlunn
Component version:
Operating system type:
Operating system version:

Description

Hi,

I use FileZilla 3.0.9.2 on Windows XP SP2 with 1 GB RAM to connect to and download a full website with user-generated content. The website has about 15,000 directories and more than 20,000 files, about 7 GB in total.

When FileZilla has retrieved 4,000-5,000 files, overall performance starts to degrade and CPU usage climbs to a constant 50%.

It does not crash. It only makes the computer and the download very slow.

Change History (8)

comment:1 by Alexander Schuch, 16 years ago

Does it slow down after 4,000-5,000 files have been downloaded, or after that many files have been added to the transfer queue?

comment:2 by ellach, 16 years ago

First I tested with two simultaneous connections: one listing directories and another one downloading.
Then I tested with one connection: first list, then download.

In both cases I found the same problem. Maybe the directory tree is too big, or the transfer queue (with the downloaded files) is too large.

In both cases it takes minutes to get the listing for the first few thousand files, but after that, performance slows down dramatically.

comment:3 by Alexander Schuch, 16 years ago

If you right-click on a remote directory and pick "Add to queue" from the context menu, FileZilla traverses the remote directory recursively, adding every file it finds to the transfer queue. If the directory tree is large, it takes some time for FileZilla to walk the whole tree. If you check the message log, you should see FileZilla working.
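
For illustration, a minimal self-contained sketch of such a recursive traversal, using hypothetical types rather than FileZilla's actual code:

#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in types; not FileZilla's actual engine code.
struct Node {
    std::string name;
    bool isDir;
    std::vector<Node> children; // used only when isDir is true
};

// Depth-first walk, queueing every file found. On a real server each
// directory costs one CWD/LIST round trip, so 15,000 directories mean
// 15,000 listings before the first transfer even starts.
void QueueRecursively(const Node& node, const std::string& parent)
{
    const std::string path = parent + "/" + node.name;
    if (!node.isDir) {
        std::cout << "queued: " << path << '\n'; // stand-in for AddToQueue()
        return;
    }
    for (const Node& child : node.children)
        QueueRecursively(child, path);
}

int main()
{
    Node root{"glossary", true, {
        {"2343", true, {{"a.png", false, {}}, {"b.png", false, {}}}},
        {"readme.txt", false, {}},
    }};
    QueueRecursively(root, "/web/uploads");
}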

Does it get slower and slower even if you do not transfer the files but only add them to the transfer queue? What if you right-click in the transfer queue and select "Clear queue"? Does it go fast again, only to slow down once the transfer queue fills up again?

Please try this and report back. Also, which CPU do you have (model/clock speed)?

comment:4 by ellach, 16 years ago

Yes, the following happens:
after adding a few thousand files to the download queue, it starts to get slower. But it is not the connection, only "what happens after the LIST command":

Command: CWD /web/uploads/glossary/
Response: 250 OK. Current directory is /web/uploads/glossary
Command: PWD
Response: 257 "/web/uploads/glossary" is your current location
Command: CWD 2343
Response: 250 OK. Current directory is /web/uploads/glossary/2343
Command: PWD
Response: 257 "/web/uploads/glossary/2343" is your current location
Command: PASV
Response: 227 Entering Passive Mode (212,36,74,242,35,21)
Command: LIST
Response: 150 Accepted data connection
Response: 226-ASCII
Response: 226-Options: -a -l
Response: 226 3 matches total

Sending these commands and receiving the answers is really quick, but after the LIST command is sent (and presumably the answer received), it takes several seconds, with the CPU at 50%, to list the next directory. The longer the queue, the longer the time between LIST and the next command.

In fact, it keeps getting worse: after the computer had been doing this for 3 days, it still couldn't get the whole directory/file list.

I haven't seen the code, but as a (Pascal and PHP) programmer, I would suppose:
a) You go through the whole list each time a file is added.
b) You add the item directly to the visual component, which refreshes the whole list for each item added.

Possible solutions:
a) Use a linked list (not just an array), keeping a pointer to the last item added.
b) Add files to the queue and refresh the GUI in two separate threads:

  • Update the GUI element only every "n" items added (see the sketch below)
  • Enable a "big site mode" (or "mirror site" mode), which would work differently

I have a Pentium 4 dual-core CPU at 3 GHz and 1 GB RAM.

comment:5 by Tim Kosse, 16 years ago

Part of the problem is that the engine-internal cache is currently stored unsorted, with sorting done purely by the UI, so cache lookups have to scan sequentially through the listings.

With a sorted list, it could use binary search instead. Unfortunately, that would increase memory requirements, as the listings would need to be stored sorted both case-insensitively and case-sensitively.
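
For illustration, a sketch of the lookup difference, with simplified types (the real engine cache is more involved):

#include <algorithm>
#include <string>
#include <vector>

struct CacheEntry { std::string path; /* listing data ... */ };

// O(n): what an unsorted cache is forced to do for every lookup.
const CacheEntry* FindLinear(const std::vector<CacheEntry>& cache,
                             const std::string& path)
{
    for (const CacheEntry& e : cache)
        if (e.path == path)
            return &e;
    return nullptr;
}

// O(log n): requires the cache to be kept sorted by path. A second,
// case-insensitive ordering would need its own sorted index, which is
// the extra memory cost mentioned above.
const CacheEntry* FindSorted(const std::vector<CacheEntry>& cache,
                             const std::string& path)
{
    auto it = std::lower_bound(cache.begin(), cache.end(), path,
        [](const CacheEntry& e, const std::string& p) { return e.path < p; });
    return (it != cache.end() && it->path == path) ? &*it : nullptr;
}

int main()
{
    std::vector<CacheEntry> cache{{"/a"}, {"/b"}, {"/web/uploads/glossary"}};
    return FindSorted(cache, "/web/uploads/glossary") ? 0 : 1; // sorted input
}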

comment:6 by ellach, 16 years ago

Then let me suggest implementing the full download process without UI elements.

Also, let me suggest a "mirror mode" with only one connection, à la wget (sketched below):

Enter a directory
List its files
Download its files
Continue into its child directories
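
For illustration, a rough sketch of such a single-connection mirror pass. ListDirectory() and DownloadFile() are hypothetical stand-ins for the real CWD/LIST and RETR machinery, stubbed here so the sketch runs:

#include <iostream>
#include <string>
#include <vector>

struct Entry { std::string name; bool isDir; };

std::vector<Entry> ListDirectory(const std::string& path)
{
    // Stub: a real implementation would issue CWD + LIST over the wire.
    if (path == "/web") return {{"uploads", true}, {"index.html", false}};
    if (path == "/web/uploads") return {{"a.jpg", false}, {"b.jpg", false}};
    return {};
}

void DownloadFile(const std::string& path)
{
    std::cout << "download: " << path << '\n'; // stub for RETR
}

// List a directory, download its files, then recurse into its children,
// so memory never holds more than one directory's worth of entries.
void Mirror(const std::string& path)
{
    const std::vector<Entry> entries = ListDirectory(path);
    for (const Entry& e : entries)
        if (!e.isDir)
            DownloadFile(path + "/" + e.name);
    for (const Entry& e : entries)
        if (e.isDir)
            Mirror(path + "/" + e.name);
}

int main() { Mirror("/web"); }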

Thank you,
Edu

comment:7 by nlunn, 16 years ago

Thanks for this great program. I have been looking for a long time for a program that could replace LeechFTP and its follow-up (BitBeamer), and now I have found it.

I have had some of the same problems as the original poster. It seems like the application crashes when trying to download a big site (200-500 directories, 140k files).

To work around the problem, I disabled the UI controls and minimized the window so nothing had to be rendered on screen. I did not start the transfer; instead I added the items to the queue and, once that was done, started processing the queue. I guess this took about 15 minutes on my computer.

To me it seems like some events are triggered every time a file is found and added to the queue.
If it is the refresh of the UI queue that takes the time, I suggest disabling refresh of the control and only refreshing it once a full directory has been added. Alternatively, you could update the UI on a timer; at least for me there is no need to have the queue updated as frequently as it is now.
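
For illustration, a sketch of the timer idea, throttling refreshes to a minimum interval (RefreshQueueView() is again a hypothetical stand-in, not FileZilla's actual API):

#include <chrono>

using Clock = std::chrono::steady_clock;

Clock::time_point lastRefresh = Clock::now();
constexpr std::chrono::milliseconds kMinInterval(250);

void RefreshQueueView() { /* repaint the queue list control */ }

// Call after each item is added; directoryFinished forces a refresh once
// a whole directory has been queued, as suggested above.
void MaybeRefresh(bool directoryFinished)
{
    const auto now = Clock::now();
    if (directoryFinished || now - lastRefresh >= kMinInterval) {
        RefreshQueueView();
        lastRefresh = now;
    }
}

int main()
{
    for (int i = 0; i < 10000; ++i)
        MaybeRefresh(i % 1000 == 999); // pretend every 1000th item ends a directory
}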

comment:8 by Tim Kosse, 16 years ago

Try the most recent nightly build. FileZilla should be significantly faster.
