Data transfer progress
Not sure, but since each socket can only send 8 KB at a time, won't it be faster if more than one socket is sending at the same time?
Or are multiple sockets bottlenecked by the hardware they share? (Meaning one socket is enough, and loading up multiple sockets with data won't increase the water pressure?)
It's not what you're taught, it's what you learn.
July 04, 2002 11:04 AM
Unless your connection operates at the full bandwidth/capacity of the hardware in your computer, this is just not possible.
ie. not unless you were on a connection that could exceed the 10 Mbps that your network card could put through, etc.
The bottleneck is in the OS (and network library?) that handles the buffering/implementation of TCP etc. The place where you might notice less-than-best speed is in the sliding window implementation. If the connection is fast enough that the full window can be dumped before an ACK is received, there will be some delay before it starts sending again. Theoretically, by opening a second socket you could reduce this by having two windows; I'm not 100% sure, but I believe a separate window is kept for each socket (rather than one per IP address).
I'm not sure if TCP (or sockets...) automatically increases the size of the window depending on transmission times etc.
Using multiple sockets is not a viable solution in any type of system. Using a single socket on a 100 Mb network, I can transfer a 40 MB file in approx 4 secs, i.e. roughly 80 Mbps of throughput. This is over a switched network, not a hub. Usually the bottleneck is not the OS (just try using the loopback device at 127.0.0.1) but the hardware it's using. If you have a decent network card (ie 100 Mb) and a hub, you are limited by the hub, since the data gets replicated to all the machines on the network and you get "shared" bandwidth. If just one PC on the network is 10 Mb, your entire network runs at 10 Mb so that PC can keep up. A switch allows the network to be more efficient. But enough about that crap.
Calling send() on a 100 MB buffer means the OS will split the data itself into chunks suitable for being sent as TCP/IP packets over the network. NO system that I know of supports 100 MB packets; most use around 1500 bytes in an internet environment (since eventually this app will be going there?). If routers on the network use a smaller packet size, your packets get fragmented anyway, so no matter what packet size you set, the router determines whether you're allowed to send it. Most routers are set to handle 1500-byte packets. Modems use approx 576, because smaller packets are more reliable (you get more ACKs, so you don't need to resend as much if a packet gets lost; plus, being on dialup, your transfer to the network is slow).
See below for the window explanation.
So if you wish to keep track of how much data is sent, send the data using smaller buffers. 1 KB is a decent size; 8 KB is good as well. You can pick it based on the file size. Just realize that the call waits until the data is sent, so a 1 KB buffer is best, since even on a modem it's unlikely to take more than 1 second. (Though send() really returns after the data is placed on the TCP/IP stack, so as you call send(), data from the previous calls is already being transferred. You usually won't have to wait the entire transfer time per unit; instead it happens in chunks, and you will seem to "stall" between groups of calls if you call send() very fast.)
Again, you don't care how the OS and TCP/IP stack handle the data; you just need to know that you send the data and it gets to the other side. You need to understand that splitting your send() calls into smaller buffers is more efficient because:
1. You don't need a 100 MB file in memory.
2. Not all PCs can hold an entire file in memory, but they can store it on disk.
3. If you do allocate a large chunk, it's limited by physical RAM; anything larger than that will be placed in the swap file, causing more disk usage. Now the file is using the equivalent of two files' worth of storage.
4. A 4 GB file is the MAX you can send (I think send() is limited to 32-bit unsigned ints, right? If it uses signed 32-bit ints, you are limited to only sending 2 GB). Using the buffered approach, you can send a file of any size without caring how much memory/swap space will be needed. An app sending files should not be resource intensive.
5. Usually sending over a very fast network is limited by hard drive speed.
In the end, just split your send() calls into smaller chunks that can be managed better. If I can get a 40 MB file to transfer in only 4 secs over a 100 Mb switched network using a single socket, I highly doubt dual sockets would help. If anything, it requires the other side to make two connections (or each side to make a connection) and then requires a method to reorganize the data. Sending multiple files over multiple sockets may be faster, but sending a single file over multiple sockets is just plain dumb. Not only do you now have to make sure the data is ordered correctly, you also add overhead to the data, since you need to tell the other side which socket is carrying what, whereas with a single-socket send you just send the file and close the connection.
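The chunked-send loop described above can be sketched like this. This is an illustrative Python sketch, not anyone's actual code from the thread: the 8 KB chunk size is one of the sizes suggested, and a local socketpair stands in for a real network connection.

```python
import socket
import threading

# 8 KB per send() call, as suggested above (any small size works).
CHUNK = 8 * 1024

def send_with_progress(sock, data):
    """Send `data` in CHUNK-sized pieces, reporting percent complete."""
    total = len(data)
    sent = 0
    while sent < total:
        # send() may accept fewer bytes than requested; trust its return value.
        n = sock.send(data[sent:sent + CHUNK])
        sent += n
        print(f"progress: {100 * sent // total}%")
    return sent

# A connected socket pair simulates sender and receiver on one machine.
a, b = socket.socketpair()
payload = b"x" * (64 * 1024)  # 64 KB test payload

# Drain the receiving side in a thread so the send buffer never fills up.
received = bytearray()
def drain():
    while len(received) < len(payload):
        received.extend(b.recv(65536))

t = threading.Thread(target=drain)
t.start()
send_with_progress(a, payload)
t.join()
a.close()
b.close()
```

The key point is that progress is tracked per send() call, which is exactly why the thread recommends small buffers: one giant send() gives you no intermediate checkpoints.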
TCP on Windows and Linux uses a scaling window: slower transmission times result in the window being made smaller, so things are more reliable (and retransmits cover smaller amounts of data). The only thing you can adjust through the registry or config files is the initial window size; a large one means you start by assuming a fast connection, while a smaller one assumes a slower link. The window is then adjusted as the transmission goes along.
I think each socket may get a separate window, but I am not sure either. Also, delays don't occur during send unless the other side can't send enough ACKs to keep up. As you send packets (let's pretend 6 packets per window), the other side sends ACKs as fast as it can (ie one for each packet).
so packet 1 is sent
packet 2 sent
packet 1 recv, ack 1 sent
packet 3 sent
packet 2 recv, ack 2 sent
ack 1 recv
packet 4 sent
packet 5 sent
packet 3 recv, ack 3 sent
packet 4 recv, ack 4 sent
ack 3 recv
packet 6 sent
ack 4 recv
packet 7 sent (since we will only send up to six packets without acks)
etc...
now:
packet 1 sent
packet 2 sent
packet 3 sent
packet 4 sent
packet 5 sent
packet 6 sent
waits for ack
packet 1 recv, ack 1 sent
waits for ack
ack 1 recv, packet 7 sent
packet 2 recv, ack 2 sent
waits for ack
ack 2 recv, packet 8 sent
packets 3, 4, 5 recv, acks 3, 4, 5 sent
acks 3, 4, 5 recv, packets 9, 10, 11 sent
acks 6, 7, 8 recv, packets 12, 13, 14 sent
Getting the idea? There is no delay if the other side can send ACKs as soon as the packets are received. The idea is that ACKs may take X ms to make it back, so instead we send packets as fast as possible until we have sent X packets that have no corresponding ACK. We don't do it in chunks like you seem to think; ie we don't send 6 packets, wait for 6 ACKs, then send another 6, etc. We send 6, wait for an ACK, then send a packet for each ACK until we have 6 unacked packets. Making sense?
All this, though, has nothing to do with writing a network app. Just send using smaller buffer sizes and you can keep track of the percentage done.
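The "send until six packets are unacked" behaviour in the traces above can be sketched as a toy simulation. This is illustrative Python only: the window size of 6 and the packet count of 14 are the invented numbers from the example, not real TCP parameters, and a real receiver would ack asynchronously rather than in lockstep.

```python
from collections import deque

WINDOW = 6   # max packets allowed in flight without an ack
TOTAL = 14   # packets to send, matching the trace above

def simulate():
    """Log a send/ack sequence that never exceeds WINDOW unacked packets."""
    log = []
    next_to_send = 1
    unacked = deque()
    while next_to_send <= TOTAL or unacked:
        # Send as long as fewer than WINDOW packets are outstanding.
        while next_to_send <= TOTAL and len(unacked) < WINDOW:
            unacked.append(next_to_send)
            log.append(f"packet {next_to_send} sent")
            next_to_send += 1
        # An ack for the oldest outstanding packet frees one window slot,
        # so sending resumes one packet per ack -- not in chunks of six.
        acked = unacked.popleft()
        log.append(f"ack {acked} recv")
    return log

log = simulate()
```

Walking the log shows the burst of six at the start, then a strict one-packet-per-ack rhythm, which is the point the post is making.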
July 04, 2002 09:44 PM
Like I said, if the full window can be dumped before an ACK is received, the connection possibly won't be operating at max efficiency.
Comparing over a LAN is not very effective since the latency is almost negligible.
I see your point a bit better now, AP; that'll teach me to respond to posts when I'm a wee bit tired, heh.
Anyhoo, be that as it may, multiple sockets will only add congestion that further stresses the link, and it's not a good idea. Windows and Linux both handle TCP window sizing automatically, so the programmer need not worry about it. The buffer size used in send() means virtually nothing, since it all gets grouped together on the TCP stack.
UDP, on the other hand, would require properly sized packets, since each call sends a single packet with a data load the size of the buffer being sent.
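That one-datagram-per-send behaviour can be sketched like this. An illustrative Python sketch over loopback; the payloads and variable names are made up for the example.

```python
import socket

# One UDP socket to transmit, one bound to an ephemeral loopback port.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
rx.settimeout(5)          # avoid hanging if a datagram is lost
addr = rx.getsockname()

# Two sendto() calls become two distinct datagrams: recvfrom() returns
# exactly one datagram per call, never a merged or split payload --
# unlike TCP, where recv() just sees one continuous byte stream.
tx.sendto(b"first", addr)
tx.sendto(b"second", addr)
d1, _ = rx.recvfrom(2048)
d2, _ = rx.recvfrom(2048)

tx.close()
rx.close()
```

This is why the buffer size matters for UDP but not for TCP: with UDP, the buffer you pass *is* the packet.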
An interesting visual demo would be to simulate an entire network topology testing these things. I know it's been done since the 70s, but it would still probably look cool and could be interesting from an AI point of view. Maybe even a puzzle game that has the player building a network with a variety of servers and line speeds. Each puzzle would give you a limited number of lines (no length limit), switches/hubs/token rings, routers and servers; then the player must try to create a network topology that allows a client connected at any one of the servers to send a file to any of the other servers within a certain time. Things like simulated packet loss, latency, and other oddities of the net could be implemented. I guess it's highly off topic, but the idea popped into my head and I figured I would share. Heck, it may be the perfect "game" to show people the problems associated with networks like the internet. Also the benefits.
Ok, let's say that the client has sent 10 packets, each 1 KB. Like so:
Send packet [1]
Send packet [2]
Send packet [3]
Send packet [4]
Send packet [5]
Send packet [6]
Send packet [7]
Send packet [8]
Send packet [9]
Send packet [10]
How will these packets arrive at the server? What I mean is: does each call to the send function correspond to a matching call to the recv function that receives exactly the data sent by it, or does it all end up in the same buffer (socket buffer)? For example, could this happen when calling the recv function 10 times:
Received packet [1]
Received packet [2]
Received part of packet [3]
Received the rest of packet [3] and packet [4]
Received packet [5]
Received packet [6]
Received packet [7]
Received packet [8]
Received packet [9]
Received packet [10]
July 05, 2002 11:17 AM
Yes, you will have to re-assemble the data at the other end, and you may not completely fill the buffer each time you call recv. Think of it more as a stream than a set of chunks on the receiving end.
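Since recv() may return partial data like that, the usual pattern is a loop that keeps reading until the expected byte count has arrived. A minimal Python sketch; the helper name recv_all and the socketpair setup are illustrative, not from the thread.

```python
import socket

def recv_all(sock, length):
    """Read exactly `length` bytes, looping over partial recv() results."""
    buf = bytearray()
    while len(buf) < length:
        # recv() may return anything from 1 byte up to the amount asked for.
        chunk = sock.recv(length - len(buf))
        if not chunk:
            raise ConnectionError("socket closed before full message arrived")
        buf.extend(chunk)
    return bytes(buf)

a, b = socket.socketpair()

# Ten 1 KB "packets" sent back to back arrive as one undifferentiated
# byte stream; the receiver cannot see where one send() ended.
for i in range(10):
    a.sendall(bytes([i]) * 1024)
a.close()

data = recv_all(b, 10 * 1024)
b.close()
```

Because the boundaries vanish in the stream, real protocols prefix each message with its length (or use a delimiter) so the receiver knows how much to ask recv_all for.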
Fingh, I don't understand your comment about chunks being dropped due to size if this guy is using TCP/IP. TCP/IP is stream based, so the size he feeds the data in at is largely irrelevant. If we were talking UDP, then sure, but reliable messaging over UDP is a whole different can of worms and certainly not something I'd pick for doing file transfers.
quote:
Original post by Anonymous Poster
Fingh, I don't understand your comment about chunks being dropped due to size if this guy is using TCP/IP. TCP/IP is stream based, so the size he feeds the data in at is largely irrelevant. If we were talking UDP, then sure, but reliable messaging over UDP is a whole different can of worms and certainly not something I'd pick for doing file transfers.
TCP/UDP doesn't matter. Packets still get dropped and chopped at routers. The difference is that TCP handles that for you in the background, while with UDP you have to handle it yourself. Typically, larger packets are more costly when dropped. You have to find the point of diminishing returns on size: you don't want to send packets so small that your headers comprise the bulk of your bandwidth, but make them small enough that the routers won't continuously fragment them.