Opened 10 years ago

Last modified 10 years ago

#9118 new Bug report

utf8 and non-utf8 characters corrupt filezilla cache?

Reported by: Arkadiusz Miskiewicz
Owned by:
Priority: normal
Component: FileZilla Client
Keywords: utf-8
Cc:
Component version:
Operating system type: Linux
Operating system version: PLD Linux Th

Description

Test case for filezilla 3.7.3

Go to:
ftp://arm-filezilla@ftp.arm.beep.pl/
(password: filezilla-project.org
access will remain available for a few weeks for testing)

The directory structure is shown below (listed with lftp, a Unix command-line client that also sends OPTS UTF8 ON).

Some entries are properly encoded in UTF-8, like "test_polskich_znaków", while others are encoded in cp1250 ("RZ 47 Jelenia G�ra").
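The difference between the two kinds of entries can be checked on the raw bytes. The following is only a minimal sketch: the function name is hypothetical, and the spelling "Góra" is an assumption, since the listing shows that name only with a replacement character.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the raw file-name bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Assumed original names. The ticket shows the second one only with a
# replacement character, so "Góra" is a guess at the intended spelling.
utf8_name = "test_polskich_znaków".encode("utf-8")
cp1250_name = "RZ 47 Jelenia Góra".encode("cp1250")

print(is_valid_utf8(utf8_name))    # True
print(is_valid_utf8(cp1250_name))  # False: 0xF3 is not followed by continuation bytes
```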

According to https://wiki.filezilla-project.org/Character_Set, FileZilla does not support invalid encodings. That is not a great solution (it could at least indicate that such files exist, as lftp does, so the user would be aware of the problem), but this bug is not about that.
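As a rough illustration of what "indicate that there are some files" could mean, a client might show undecodable names with U+FFFD in place of the bad bytes and flag them. This is a sketch of that idea only, not how lftp or FileZilla is implemented; display_name is a hypothetical helper, and the sample bytes assume the cp1250 encoding of "ł" (0xB3).

```python
from typing import Tuple


def display_name(raw: bytes) -> Tuple[str, bool]:
    """Return (text to show, whether the name had invalid UTF-8 bytes)."""
    try:
        return raw.decode("utf-8"), False
    except UnicodeDecodeError:
        # Keep the entry visible; each undecodable byte becomes U+FFFD.
        return raw.decode("utf-8", errors="replace"), True


# Hypothetical raw names as they might arrive on the data connection.
for raw in (b"gfgdfg", b"RZ 47 Wa\xb3brzych"):
    text, damaged = display_name(raw)
    print(text, "(invalid encoding)" if damaged else "")
```

The entry stays in the listing either way, so the user can at least see that something is there.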

If I enter "test_polskich_znaków"
( http://imageshack.us/a/img6/9378/0aoc.jpg )
and then "gfgdfg", the directory listing in FileZilla sometimes gets corrupted ( http://imageshack.us/a/img543/2641/op5b.jpg ). FileZilla starts to think that a second, corrupted directory exists alongside the properly encoded one. I have no idea why, but it should not do that.

lftp arm-filezilla@ftp.arm.beep.pl:/> ls -alR
drwxr-xr-x    2 0          999                42 Dec 10 11:20 .
drwxr-xr-x    2 0          999                42 Dec 10 11:20 ..
drwx---r-x    3 10089      999                42 Dec 10 11:21 filezilla.arm.beep.pl


./filezilla.arm.beep.pl:

drwx---r-x    3 10089      999                42 Dec 10 11:21 .
drwxr-xr-x    2 0          999                42 Dec 10 11:20 ..
drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 test_polskich_znaków


./filezilla.arm.beep.pl/test_polskich_znaków:

drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 .
drwxrwxrwx    1 0          0                  22 Dec 10 11:20 ..
drwxr-xr-x    4 29634      nogroup            60 Nov 19 10:35 RZ 47 Jelenia G�ra
drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:21 RZ 47 Wa�brzych
drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:22 RZ47 ZIELONA G�RA
drwxr-xr-x    2 29634      nogroup            10 Dec 10 11:09 gfgdfg


./filezilla.arm.beep.pl/test_polskich_znaków/RZ 47 Jelenia G�ra:

drwxr-xr-x    4 29634      nogroup            60 Nov 19 10:35 .
drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 ..
drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:21 2 sweter -50%
drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:21 SF JELENIA G�RA


./filezilla.arm.beep.pl/test_polskich_znaków/RZ 47 Jelenia G�ra/2 sweter -50%:

drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:21 .
drwxr-xr-x    4 29634      nogroup            60 Nov 19 10:35 ..
-rw-r--r--    1 29634      nogroup             0 Dec 10 11:21 testfile1.txt


./filezilla.arm.beep.pl/test_polskich_znaków/RZ 47 Jelenia G�ra/SF JELENIA G�RA:

drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:21 .
drwxr-xr-x    4 29634      nogroup            60 Nov 19 10:35 ..
-rw-r--r--    1 29634      nogroup             0 Dec 10 11:21 testfile2.txt


./filezilla.arm.beep.pl/test_polskich_znaków/RZ 47 Wa�brzych:

drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:21 .
drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 ..
-rw-r--r--    1 29634      nogroup             0 Dec 10 11:21 testfile3.txt


./filezilla.arm.beep.pl/test_polskich_znaków/RZ47 ZIELONA G�RA:

drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:22 .
drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 ..
-rw-r--r--    1 29634      nogroup             0 Dec 10 11:22 testfile4.txt


./filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg:

drwxr-xr-x    2 29634      nogroup            10 Dec 10 11:09 .
drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 ..

Change History (4)

comment:1 by Arkadiusz Miskiewicz, 10 years ago

Log from FileZilla (this time FileZilla was run with LANG=en_US.UTF-8; previously with LANG=pl_PL.UTF-8).

Note FileZilla's output:

Command:	CWD gfgdfg
Response:	250 OK. Current directory is /filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg

vs lftp:

---> CWD gfgdfg
<--- 250 OK. Current directory is /filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg

Is FileZilla somehow misinterpreting the character in the CWD response?
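The garbled "znaków" is what UTF-8 bytes look like when they are decoded with a single-byte charset. A minimal sketch that reproduces it follows; which charset FileZilla actually falls back to is not stated in this ticket, so Latin-1 here is an assumption that happens to match the log.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as Latin-1 (assumed fallback)."""
    return text.encode("utf-8").decode("latin-1")


path = "/filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg"
print(misread_as_latin1(path))
# /filezilla.arm.beep.pl/test_polskich_znakÃ³w/gfgdfg
```

The "ó" (U+00F3) is the two bytes 0xC3 0xB3 in UTF-8, which Latin-1 renders as "Ã" and "³".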

Status:	Resolving address of ftp.arm.beep.pl
Status:	Connecting to 193.239.44.109:21...
Status:	Connection established, waiting for welcome message...
Response:	220-Welcome to Pure-FTPd.
Response:	220-You are user number 8 of 100 allowed.
Response:	220-Local time is now 11:47. Server port: 21.
Response:	220-This is a private system - No anonymous login
Response:	220-IPv6 connections are also welcome on this server.
Response:	220 You will be disconnected after 7 minutes of inactivity.
Command:	USER arm-filezilla
Response:	331 User arm-filezilla OK. Password required
Command:	PASS *********************
Response:	230 OK. Current restricted directory is /
Command:	SYST
Response:	215 UNIX Type: L8
Command:	FEAT
Response:	211-Extensions supported:
Response:	 EPRT
Response:	 IDLE
Response:	 MDTM
Response:	 SIZE
Response:	 MFMT
Response:	 REST STREAM
Response:	 MLST type*;size*;sizd*;modify*;UNIX.mode*;UNIX.uid*;UNIX.gid*;unique*;
Response:	 MLSD
Response:	 AUTH TLS
Response:	 PBSZ
Response:	 PROT
Response:	 UTF8
Response:	 ESTA
Response:	 PASV
Response:	 EPSV
Response:	 SPSV
Response:	 ESTP
Response:	211 End.
Command:	OPTS UTF8 ON
Response:	200 OK, UTF-8 enabled
Status:	Connected
Status:	Retrieving directory listing...
Command:	PWD
Response:	257 "/" is your current location
Command:	TYPE I
Response:	200 TYPE is now 8-bit binary
Command:	PASV
Response:	227 Entering Passive Mode (193,239,44,109,181,11)
Command:	MLSD
Response:	150 Accepted data connection
Response:	226-Options: -a -l 
Response:	226 3 matches total
Listing:	type=cdir;sizd=42;modify=20131210102010;UNIX.mode=0755;UNIX.uid=0;UNIX.gid=999;unique=fe01g1c00147cc; .
Listing:	type=pdir;sizd=42;modify=20131210102010;UNIX.mode=0755;UNIX.uid=0;UNIX.gid=999;unique=fe01g1c00147cc; ..
Listing:	type=dir;sizd=42;modify=20131210102010;UNIX.mode=0705;UNIX.uid=10089;UNIX.gid=999;unique=fe01gb26c8438; filezilla.arm.beep.pl
Status:	Directory listing successful
Status:	Retrieving directory listing...
Command:	CWD filezilla.arm.beep.pl
Response:	250 OK. Current directory is /filezilla.arm.beep.pl
Command:	PWD
Response:	257 "/filezilla.arm.beep.pl" is your current location
Command:	PASV
Response:	227 Entering Passive Mode (193,239,44,109,188,192)
Command:	MLSD
Response:	150 Accepted data connection
Response:	226-Options: -a -l 
Response:	226 3 matches total
Listing:	type=cdir;sizd=42;modify=20131210102117;UNIX.mode=0705;UNIX.uid=10089;UNIX.gid=999;unique=fe01gb26c8438; .
Listing:	type=pdir;sizd=42;modify=20131210102010;UNIX.mode=0755;UNIX.uid=0;UNIX.gid=999;unique=fe01g1c00147cc; ..
Listing:	type=dir;sizd=110;modify=20131210101514;UNIX.mode=0755;UNIX.uid=29634;UNIX.gid=65534;unique=fe01g1127440fe; test_polskich_znaków
Status:	Directory listing successful
Status:	Retrieving directory listing...
Command:	CWD test_polskich_znaków
Response:	250 OK. Current directory is /filezilla.arm.beep.pl/test_polskich_znaków
Command:	PWD
Response:	257 "/filezilla.arm.beep.pl/test_polskich_znaków" is your current location
Command:	PASV
Response:	227 Entering Passive Mode (193,239,44,109,183,69)
Command:	MLSD
Response:	150 Accepted data connection
Response:	226-Options: -a -l 
Response:	226 6 matches total
Listing:	type=cdir;sizd=110;modify=20131210101514;UNIX.mode=0755;UNIX.uid=29634;UNIX.gid=65534;unique=fe01g1127440fe; .
Listing:	type=pdir;sizd=42;modify=20131210102010;UNIX.mode=0705;UNIX.uid=10089;UNIX.gid=999;unique=fe01gb26c8438; ..
Status:	Invalid character sequence received, disabling UTF-8. Select UTF-8 option in site manager to force UTF-8.
Listing:	
Listing:	
Listing:	
Listing:	type=dir;sizd=10;modify=20131210100904;UNIX.mode=0755;UNIX.uid=29634;UNIX.gid=65534;unique=fe01g1521cb43d; gfgdfg
Status:	Directory listing successful
Status:	Retrieving directory listing...
Command:	CWD gfgdfg
Response:	250 OK. Current directory is /filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg
Command:	PWD
Response:	257 "/filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg" is your current location
Command:	PASV
Response:	227 Entering Passive Mode (193,239,44,109,176,113)
Command:	MLSD
Response:	150 Accepted data connection
Response:	226-Options: -a -l 
Response:	226 2 matches total
Listing:	type=cdir;sizd=10;modify=20131210100904;UNIX.mode=0755;UNIX.uid=29634;UNIX.gid=65534;unique=fe01g1521cb43d; .
Listing:	type=pdir;sizd=110;modify=20131210101514;UNIX.mode=0755;UNIX.uid=29634;UNIX.gid=65534;unique=fe01g1127440fe; ..
Status:	Directory listing successful

And the same session from the lftp client:

[arekm@t400 ~]$ lftp ftp://arm-filezilla@ftp.arm.beep.pl
Hasło: 
lftp arm-filezilla@ftp.arm.beep.pl:~> debug 30
lftp arm-filezilla@ftp.arm.beep.pl:~> ls
FileCopy(0x200fe40) enters state INITIAL
FileCopy(0x200fe40) enters state DO_COPY
---- dns cache hit
---- attempt number 0
---- attempt number 1
---- Łączenie się z ftp.arm.beep.pl (193.239.44.109) port 21
<--- 220-Welcome to Pure-FTPd. 
<--- 220-You are user number 8 of 100 allowed.
<--- 220-Local time is now 11:49. Server port: 21.
<--- 220-This is a private system - No anonymous login
<--- 220-IPv6 connections are also welcome on this server.
<--- 220 You will be disconnected after 7 minutes of inactivity.
---> FEAT
<--- 211-Extensions supported:
<---  EPRT
<---  IDLE
<---  MDTM
<---  SIZE
<---  MFMT
<---  REST STREAM
<---  MLST type*;size*;sizd*;modify*;UNIX.mode*;UNIX.uid*;UNIX.gid*;unique*;
<---  MLSD
<---  AUTH TLS
<---  PBSZ
<---  PROT
<---  UTF8
<---  ESTA
<---  PASV
<---  EPSV
<---  SPSV
<---  ESTP
<--- 211 End.
---> AUTH TLS
<--- 234 AUTH TLS OK.
---> OPTS UTF8 ON
Certificate depth: 2; subject: /C=US/O=GeoTrust Inc./CN=GeoTrust Global CA; issuer: /C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
Certificate depth: 1; subject: /C=US/O=GeoTrust, Inc./CN=RapidSSL CA; issuer: /C=US/O=GeoTrust Inc./CN=GeoTrust Global CA
Certificate depth: 0; subject: /serialNumber=NRyS9MhLFtTRvHp8tM21ueXdQdvEqqqs/OU=GT58848931/OU=See www.rapidssl.com/resources/cps (c)13/OU=Domain Control Validated - RapidSSL(R)/CN=*.beep.pl; issuer: /C=US/O=GeoTrust, Inc./CN=RapidSSL CA
Certificate verification: subjectAltName: `ftp.arm.beep.pl' matched
<--- 200 OK, UTF-8 enabled
---> OPTS MLST type;size;modify;UNIX.mode;UNIX.uid;UNIX.gid;
<--- 200  MLST OPTS type;size;sizd;modify;UNIX.mode;UNIX.uid;UNIX.gid;unique;
---> USER arm-filezilla
<--- 331 User arm-filezilla OK. Password required
---> PASS XXXX
<--- 230 OK. Current restricted directory is /
---> PWD
<--- 257 "/" is your current location
---- attempt number 1
---> PBSZ 0
<--- 200 PBSZ=0                       
---> PROT P
<--- 200 Data protection level set to "private"
---> PASV
<--- 227 Entering Passive Mode (193,239,44,109,178,243)
---- Łączenie się (gniazdo danych) z (193.239.44.109) port 45811
---- Ustanowiono połączenie dla danych
0:0 translated to pair 0:0 (0,0)
0 translated to pair 0:0 (0,0)
---> LIST
<--- 150 Accepted data connection
0:0 translated to pair 0:0 (0,0)
0 translated to pair 0:0 (0,0)
Certificate verification: subjectAltName: `ftp.arm.beep.pl' matched
<--- 226-Options: -a -l 
<--- 226 3 matches total
---- Got EOF on data connection
---- Zamykanie gniazda danych
drwxr-xr-x    2 0          999                42 Dec 10 11:20 .
drwxr-xr-x    2 0          999                42 Dec 10 11:20 ..
drwx---r-x    3 10089      999                42 Dec 10 11:21 filezilla.arm.beep.pl
copy: get hit eof
copy: waiting for put confirmation
FileCopy(0x200fe40) enters state CONFIRM_WAIT
copy: put confirmed store
FileCopy(0x200fe40) enters state GET_DONE_WAIT
copy: get is finished - all done
FileCopy(0x200fe40) enters state ALL_DONE
---- attempt number 0
---> TYPE I
<--- 200 TYPE is now 8-bit binary
lftp arm-filezilla@ftp.arm.beep.pl:/> cd filezilla.arm.beep.pl/
lftp arm-filezilla@ftp.arm.beep.pl:/filezilla.arm.beep.pl> ls
FileCopy(0x2046d00) enters state INITIAL
FileCopy(0x2046d00) enters state DO_COPY
---- attempt number 0
---- CWD path to be sent is `/filezilla.arm.beep.pl'
---> CWD filezilla.arm.beep.pl
<--- 250 OK. Current directory is /filezilla.arm.beep.pl
---> TYPE A
<--- 200 TYPE is now ASCII
---> PASV
<--- 227 Entering Passive Mode (193,239,44,109,184,135)
---- Łączenie się (gniazdo danych) z (193.239.44.109) port 47239
---- Ustanowiono połączenie dla danych
0:0 translated to pair 0:0 (0,0)
0 translated to pair 0:0 (0,0)
---> LIST
<--- 150 Accepted data connection
Certificate verification: subjectAltName: `ftp.arm.beep.pl' matched
<--- 226-Options: -a -l 
<--- 226 3 matches total
---- Got EOF on data connection
---- Zamykanie gniazda danych
drwx---r-x    3 10089      999                42 Dec 10 11:21 .
drwxr-xr-x    2 0          999                42 Dec 10 11:20 ..
drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 test_polskich_znaków
copy: get hit eof
copy: waiting for put confirmation
FileCopy(0x2046d00) enters state CONFIRM_WAIT
copy: put confirmed store
FileCopy(0x2046d00) enters state GET_DONE_WAIT
copy: get is finished - all done
FileCopy(0x2046d00) enters state ALL_DONE
---- attempt number 0
---> TYPE I
<--- 200 TYPE is now 8-bit binary
lftp arm-filezilla@ftp.arm.beep.pl:/filezilla.arm.beep.pl> cd test_polskich_znaków/
lftp arm-filezilla@ftp.arm.beep.pl:/filezilla.arm.beep.pl/test_polskich_znaków> ls
FileCopy(0x2010040) enters state INITIAL
FileCopy(0x2010040) enters state DO_COPY
---- attempt number 0
---- CWD path to be sent is `/filezilla.arm.beep.pl/test_polskich_znaków'
---> CWD test_polskich_znaków
<--- 250 OK. Current directory is /filezilla.arm.beep.pl/test_polskich_znaków
---> TYPE A
<--- 200 TYPE is now ASCII
---> PASV
<--- 227 Entering Passive Mode (193,239,44,109,183,48)
---- Łączenie się (gniazdo danych) z (193.239.44.109) port 46896
---- Ustanowiono połączenie dla danych
0:0 translated to pair 0:0 (0,0)
0 translated to pair 0:0 (0,0)
---> LIST
<--- 150 Accepted data connection
Certificate verification: subjectAltName: `ftp.arm.beep.pl' matched
<--- 226-Options: -a -l 
<--- 226 6 matches total
---- Got EOF on data connection
---- Zamykanie gniazda danych
drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 .
drwxrwxrwx    1 0          0                  22 Dec 10 11:20 ..
drwxr-xr-x    4 29634      nogroup            60 Nov 19 10:35 RZ 47 Jelenia G�ra
drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:21 RZ 47 Wa�brzych
drwxr-xr-x    2 29634      nogroup            34 Dec 10 11:22 RZ47 ZIELONA G�RA
drwxr-xr-x    2 29634      nogroup            10 Dec 10 11:09 gfgdfg
copy: get hit eof
copy: waiting for put confirmation
FileCopy(0x2010040) enters state CONFIRM_WAIT
copy: put confirmed store
FileCopy(0x2010040) enters state GET_DONE_WAIT
copy: get is finished - all done
FileCopy(0x2010040) enters state ALL_DONE
---- attempt number 0
---> TYPE I
<--- 200 TYPE is now 8-bit binary
lftp arm-filezilla@ftp.arm.beep.pl:/filezilla.arm.beep.pl/test_polskich_znaków> cd gfgdfg/
lftp arm-filezilla@ftp.arm.beep.pl:/filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg> ls
FileCopy(0x20352a0) enters state INITIAL
FileCopy(0x20352a0) enters state DO_COPY
---- attempt number 0
---- CWD path to be sent is `/filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg'
---> CWD gfgdfg
<--- 250 OK. Current directory is /filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg
---> TYPE A
<--- 200 TYPE is now ASCII
---> PASV
<--- 227 Entering Passive Mode (193,239,44,109,182,206)
---- Łączenie się (gniazdo danych) z (193.239.44.109) port 46798
---- Ustanowiono połączenie dla danych
0:0 translated to pair 0:0 (0,0)
0 translated to pair 0:0 (0,0)
---> LIST
<--- 150 Accepted data connection
Certificate verification: subjectAltName: `ftp.arm.beep.pl' matched
<--- 226-Options: -a -l 
<--- 226 2 matches total
---- Got EOF on data connection
---- Zamykanie gniazda danych
drwxr-xr-x    2 29634      nogroup            10 Dec 10 11:09 .
drwxr-xr-x    6 29634      nogroup           110 Dec 10 11:15 ..
copy: get hit eof
copy: waiting for put confirmation
FileCopy(0x20352a0) enters state CONFIRM_WAIT
copy: put confirmed store
FileCopy(0x20352a0) enters state GET_DONE_WAIT
copy: get is finished - all done
FileCopy(0x20352a0) enters state ALL_DONE
lftp arm-filezilla@ftp.arm.beep.pl:/filezilla.arm.beep.pl/test_polskich_znaków/gfgdfg>

comment:2 by Arkadiusz Miskiewicz, 10 years ago

Hmm,

"Status: Invalid character sequence received, disabling UTF-8. Select UTF-8 option in site manager to force UTF-8."

is likely the cause of the whole problem. This looks like a bug on FileZilla's side.

https://wiki.filezilla-project.org/Character_Set says:
"If an RFC 2640 compliant client sends OPTS UTF-8 ON, it has to use UTF-8 regardless whether OPTS UTF-8 ON succeeds or not."

and filezilla did send it:

Command:	OPTS UTF8 ON
Response:	200 OK, UTF-8 enabled

So it looks like FileZilla is violating the rule above, and that is the reason for the problem. Setting "force utf8" makes the problem go away.

comment:3 by Arkadiusz Miskiewicz, 10 years ago

A possible solution: if the charset setting is "autodetect" and OPTS UTF8 ON was sent, never automatically switch UTF-8 off.
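The proposed rule can be sketched as a small state object. This is an illustration of the suggestion only, not FileZilla's code; the class, field, and method names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EncodingState:
    """Hypothetical per-connection charset state."""
    charset_mode: str = "autodetect"  # assumed values: "autodetect" or "force_utf8"
    opts_utf8_sent: bool = False      # True once the client has sent OPTS UTF8 ON
    use_utf8: bool = True

    def on_invalid_sequence(self) -> None:
        """Called when a listing or reply contains bytes that are not valid UTF-8."""
        if self.charset_mode == "autodetect" and not self.opts_utf8_sent:
            # Only fall back when UTF-8 was never negotiated.
            self.use_utf8 = False
        # Otherwise keep UTF-8; the caller handles the bad entry by itself.


negotiated = EncodingState(opts_utf8_sent=True)
negotiated.on_invalid_sequence()
print(negotiated.use_utf8)  # True: stays in UTF-8 after OPTS UTF8 ON

legacy = EncodingState(opts_utf8_sent=False)
legacy.on_invalid_sequence()
print(legacy.use_utf8)      # False: falls back as before
```

With this rule, one badly encoded entry would no longer change how later replies such as CWD and PWD are decoded.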

comment:4 by Alexander Schuch, 10 years ago

Keywords: utf-8 added