some basic socket questions

Started by February 26, 2017 06:34 AM
20 comments, last by Helo7777 7 years, 8 months ago
I haven't done any socket programming before, so I'm very much a novice. The questions below may seem basic to most people here, but I have no idea how to answer them, and Google searches haven't turned up anything specific to my needs, so I'm hoping someone can enlighten me.
Aim: to programmatically send a JPEG image file in both directions between an Android client and my computer (the server), using the computer's IP address and port as the server.
Info on my code so far:
  • server side code --> Java (my PC)
  • client side code ---> Android/Java
  • Using Router Wireless Network

so my questions-

  1. do I use public (external) OR internal IP?
  2. UDP or TCP?
  3. which Port to use, ESTABLISHED or LISTENING?
  4. Also, I don't want to hard-code the size of the buffer for the image received at the server end (and I shouldn't). What is the robust alternative to the hard-coded allocation below?

int filesize=6022386;
byte [] mybytearray  = new byte [filesize];

^

5. And I don't even know: is this size in bits or bytes? The average file I would be sending (for this test run) is 500kb. So is that 500,000 bytes or 500,000,000 bits?

Thanks

can't help being grumpy...

Just need to let some steam out, so my head doesn't explode...

You don't get control over "ESTABLISHED" or "LISTENING" for TCP connections in netstat.

When your server calls bind() and listen(), it makes the socket LISTENING.

When a client then connects to that socket, a new socket is returned by accept(), and that socket is in the ESTABLISHED state.
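To make that sequence concrete, here is a minimal Java sketch. It assumes the loopback address and an OS-chosen port (port 0) purely for illustration, and `ListenAcceptSketch` is a made-up name. In Java, the `ServerSocket` constructor does the bind() and listen() steps for you.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class ListenAcceptSketch {
    /** Opens a listening socket, connects to it locally, and reports whether accept() produced a connected socket. */
    static boolean demo() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The constructor binds and listens in one step; port 0 asks the OS for any free port.
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            int port = listener.getLocalPort();
            // A client connects to the listening port...
            try (Socket client = new Socket(loopback, port);
                 // ...and accept() returns a separate socket for that one connection.
                 Socket accepted = listener.accept()) {
                // The listener is still open and can accept further clients.
                return accepted.isConnected() && !listener.isClosed();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("accepted socket connected: " + demo());
    }
}
```

While this runs, netstat should show the listener's port as LISTENING and the client/accepted pair as ESTABLISHED.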

If both your cell phone and your PC are on the same wireless network, then you should use the internal (network-level) address -- typically, 192.168.x.y.

If your cell phone is outside your network, you need to set up port forwarding from your router to your PC, and you need to connect to the external IP / port (that the router then forwards.)

enum Bool { True, False, FileNotFound };
Thanks @[member='hplus0603'], you've cleared up some issues for me

What of questions 4. and 5. ... can you help answer these also?, thx

4. Send an integer indicating the file size first. Then allocate the byte array. Then receive that many bytes.

5. All of the APIs I'm aware of operate in bytes.
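A rough Java sketch of point 4, in case it helps. The class name, method names, and the 10 MiB cap are all assumptions for illustration; `DataOutputStream.writeInt` writes the length as a 4-byte big-endian integer and `DataInputStream.readFully` blocks until the whole array is filled.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class LengthPrefixSketch {
    // Hypothetical safety cap; pick whatever suits the images you expect.
    static final int MAX_IMAGE_BYTES = 10 * 1024 * 1024;

    /** Sender: write the size first, then the bytes themselves. */
    static void sendImage(OutputStream out, byte[] image) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(image.length); // 4-byte big-endian length prefix
        data.write(image);
        data.flush();
    }

    /** Receiver: read the size, check it, allocate, then read exactly that many bytes. */
    static byte[] receiveImage(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int length = data.readInt();
        if (length < 0 || length > MAX_IMAGE_BYTES) {
            throw new IOException("Unreasonable image length: " + length);
        }
        byte[] image = new byte[length];
        data.readFully(image); // throws EOFException if the stream ends early
        return image;
    }
}
```

On a real connection you would pass `socket.getOutputStream()` and `socket.getInputStream()`. The length you send has to be the number of bytes you actually write to the stream; for a JPEG file that is the file's size on disk.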

Many thanks! :D

I will have to use Bitmap.getByteCount() at the client side. I had initially thought there was a way to do this at the server side without having to send data twice from the client.

Remember that anyone can connect to your server, not just the client you wrote. It would be slightly more work for an attacker, but in theory the reverse also applies: your client could end up talking to a server you didn't write. Even if it is your own client and server communicating, there could be bugs (in your own code or even in the image library you are using) or version mismatches (an "old" client connects to a newer server).

For these reasons, it is really important that you take great care when writing code that reads data from the network. If you prefix the image data with its length, make sure you validate that the data you actually read is exactly that length. Failure to do so can crash your program or cause security issues; a recent example is "Cloudbleed" (https://en.wikipedia.org/wiki/Cloudbleed).

As for your last statement, there are ways to do this, but they involve different tradeoffs.

If you send the entire image as a single UDP packet, then the size of the data is already in the UDP header. However, it isn't recommended to send very large packets over the public internet. In addition, with UDP you will have to deal with packet loss and duplication.

With TCP, you can have a buffer that can be resized, and you can read all bytes into it until the TCP connection is closed (assuming this is a "one shot" connection) or until some safety limit is reached (e.g. 1 MB, 10 MB, whatever it takes to stop a malicious client sending excessive data and crashing the server). If you want to keep the connection open in order to send multiple images, or for other data transfer, then you'll already need some way to break up different messages, with a length prefix being a common mechanism.
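As a sketch of the "one shot" TCP approach in Java: `ByteArrayOutputStream` serves as the resizable buffer, and `read()` returning -1 signals that the other side closed the connection. The class name, method name, and `maxBytes` parameter are assumptions for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ReadUntilClosedSketch {
    /** Reads until the stream ends, refusing to buffer more than maxBytes. */
    static byte[] readAll(InputStream in, int maxBytes) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream(); // grows as needed
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) { // -1 means the peer closed the connection
            if (buffer.size() + n > maxBytes) {
                throw new IOException("Sender exceeded the limit of " + maxBytes + " bytes");
            }
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

With a socket this would be called as `readAll(socket.getInputStream(), limit)`; the caller should close the socket if the limit is exceeded.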

Another option is to implement the image upload / download as a separate process from the main server code. This would allow you to use an existing piece of software, for example a standard HTTP server - and the client could use standard HTTP libraries to interact with it. This could avoid some of the difficulties described above.

When it comes to "size" there are two dimensions and two measurements.

1) "kilo" might mean 1000 (10**3) or 1024 (2**10)

2) "b" might mean "byte" or "bit."

Traditionally, in computer engineering, when talking about data amounts, "kilo" has always meant 1024. However, when it comes to storage (hard disks,) and throughput (networks,) the marketing people have started using 1000, because that allows them to claim a higher number of kilo-somethings. Thus, it's important to know which particular convention is used when you look at the number. Some computer people have taken to calling the 1024-based kilos "kibis" -- "1 kibibyte" would be 1024 bytes. It's not a very widespread phenomenon, but if you see it, you will know what that means.

"Bits" versus "bytes" also matters in throughput. An internet service provider will almost always specify "b" as in "bits" -- and it takes between 8 and 10 bits to send a "byte" (depending on the overhead of the communication medium.) Actually it can take even more, if you're talking on very lossy channels. Engineers who care to make the distinction, usually use small-b for bits, and big-B for bytes. So, 1 kB is 8 kb, and might take as much as 10 kb of bandwidth. But a lot of people forget, or don't know, or otherwise get it wrong, so again, it's important to make sure you understand where the number came from, so you know how to interpret it.
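To put numbers on this for a 500 kB image, here is a small Java sketch of the arithmetic. The 10 Mb/s link speed and the 10-bits-per-byte overhead figure are assumed values for illustration only, and `UnitsSketch` is a made-up name.

```java
final class UnitsSketch {
    static long kilobytesToBytes(long kB) { return kB * 1000; }   // decimal "kilo"
    static long kibibytesToBytes(long KiB) { return KiB * 1024; } // binary "kibi"
    static long bytesToBits(long bytes) { return bytes * 8; }

    /** Rough transfer time, given how many bits each byte costs on the wire. */
    static double secondsOnWire(long bytes, int wireBitsPerByte, long linkBitsPerSecond) {
        return (double) (bytes * wireBitsPerByte) / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        long bytes = kilobytesToBytes(500);
        System.out.println("500 kB  = " + bytes + " bytes = " + bytesToBits(bytes) + " bits");
        System.out.println("500 KiB = " + kibibytesToBytes(500) + " bytes");
        // Assumed: a 10 Mb/s link and 10 bits on the wire per byte of data.
        System.out.println("seconds on wire: " + secondsOnWire(bytes, 10, 10_000_000L));
    }
}
```

So 500 kB is 500,000 bytes or 4,000,000 bits, and at the assumed rate it would take roughly half a second to send.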

Finally, when it comes to sending and receiving data, any protocol will generally use a little bit of framing around the actual data. Typically, you'll see a "type" field, followed by a "length" field, followed by the actual data. Each of those "type" and "length" fields may have different encodings -- single bytes, 4-byte integers, 8-byte integers, variable-length integers, text-mode encoding all exist. And, for the integers, little-endian, and big-endian encodings both exist.
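One possible layout, sketched in Java: a 1-byte type, a 4-byte length, then the payload. The field sizes, the `FrameSketch` name, and the type value are assumptions; the byte order is a parameter so the effect of little-endian versus big-endian on the length field is visible.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class FrameSketch {
    static final byte TYPE_JPEG = 1; // hypothetical message type

    /** Builds [type: 1 byte][length: 4 bytes][payload] using the given byte order. */
    static byte[] encode(byte type, byte[] payload, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + payload.length).order(order);
        buf.put(type).putInt(payload.length).put(payload);
        return buf.array();
    }

    /** Reads the length field back out of a frame; both ends must agree on the byte order. */
    static int decodeLength(byte[] frame, ByteOrder order) {
        return ByteBuffer.wrap(frame).order(order).getInt(1); // length starts at offset 1
    }
}
```

For a 3-byte payload, the big-endian length field is `00 00 00 03` and the little-endian one is `03 00 00 00`, which is why sender and receiver have to agree on the encoding up front.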

Some encodings use a "terminating value" instead of a "prefix length", e.g. they send data - data - data - data - terminator. The other end then should stop expecting more data when seeing the terminator. However, if the data could contain the same byte values as the terminator (often, a single 0-byte is a terminator) then this creates an ambiguity, which breaks communication, and causes bugs (and even zero-day security vulnerabilities.) I strongly recommend against using a terminator.

Make sure that you validate the data you receive. If a type value comes in, and it's not one you recognize, emit an error and close the connection. If a length value comes in, and it's not reasonable for the type value (too short, or too long,) or if it's bigger than you can easily allocate (some maximum limit you impose for safety,) then emit an error and close the connection. If you spend more than a generous amount of time waiting for data without receiving it, or if the throughput of the data is lower than you'd reasonably expect, time out, emit an error, and close the connection. All of these are both so that you can find bugs in your own code, and so that you can avoid certain kinds of bad behavior / attacks by people on the internet. For example, a popular way of trolling a server is to open a connection, send a single byte, and then send nothing more. The other end will wait for a long time for the next byte, which will take up server resources. And, if there's a timeout between bytes, another attack is to open a connection, and send one byte every 9 seconds. This will avoid the timeout, yet tie up resources that aren't useful. (This is why a throughput gate, or a maximum-transaction-timeout, is useful.)
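Here is how those checks might look in Java, as a sketch only. The type value, size cap, and both timeout values are assumptions, and the header layout (1-byte type, 4-byte big-endian length) is just one possible choice. `Socket.setSoTimeout` makes a blocked read throw `SocketTimeoutException` (an `IOException`) when no data arrives in time, and a separate overall deadline guards against a sender that trickles one byte at a time.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.util.concurrent.TimeUnit;

final class ValidatingReaderSketch {
    static final int TYPE_JPEG = 1;                  // hypothetical message type
    static final int MAX_PAYLOAD = 10 * 1024 * 1024; // hypothetical size cap
    static final int IDLE_TIMEOUT_MS = 10_000;       // max wait for the next bytes
    static final long TOTAL_DEADLINE_MS = 60_000;    // max time for the whole payload

    static void checkHeader(int type, int length) throws IOException {
        if (type != TYPE_JPEG) throw new IOException("Unknown message type: " + type);
        if (length <= 0 || length > MAX_PAYLOAD) throw new IOException("Unreasonable length: " + length);
    }

    /** Reads exactly length bytes, giving up once System.nanoTime() passes deadlineNanos. */
    static byte[] readPayload(InputStream in, int length, long deadlineNanos) throws IOException {
        byte[] payload = new byte[length];
        int received = 0;
        while (received < length) {
            if (System.nanoTime() - deadlineNanos > 0) throw new IOException("Transfer took too long");
            int n = in.read(payload, received, length - received);
            if (n == -1) throw new EOFException("Connection closed after " + received + " bytes");
            received += n;
        }
        return payload;
    }

    /** The caller is expected to close the socket if this throws. */
    static byte[] readMessage(Socket socket) throws IOException {
        socket.setSoTimeout(IDLE_TIMEOUT_MS); // bounds every individual read, header included
        DataInputStream in = new DataInputStream(socket.getInputStream());
        int type = in.readUnsignedByte();
        int length = in.readInt();
        checkHeader(type, length);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(TOTAL_DEADLINE_MS);
        return readPayload(in, length, deadline);
    }
}
```

Note that the overall deadline is only re-checked between reads, so the worst case is the deadline plus one idle timeout.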

Finally, when you call recv(), you have to realize that you may receive anything between 1 and full-buffersize bytes. The machinery between the sender and the recipient will re-package the data being sent, so even if the sending side calls send(128), the receiving end may not see that as a single chunk of 128 bytes, but might instead get one chunk of 1 byte, one chunk of 126 bytes, and the final byte might be the first byte of some next chunk that's sent because the sending end sent more data afterwards. This is because TCP is a "stream" that doesn't recognize "packet boundaries," and thus you usually do best with the known-size type and size fields before variable-sized data.
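In Java the equivalent of recv() is `InputStream.read`, which has the same "may return fewer bytes than you asked for" behavior. A minimal sketch of the usual loop (the class and method names are made up for illustration):

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

final class ReadExactlySketch {
    /** Keeps calling read() until exactly count bytes have arrived. */
    static byte[] readExactly(InputStream in, int count) throws IOException {
        byte[] result = new byte[count];
        int offset = 0;
        while (offset < count) {
            // read() may deliver anything from 1 byte up to the remaining space.
            int n = in.read(result, offset, count - offset);
            if (n == -1) {
                throw new EOFException("Stream ended after " + offset + " of " + count + " bytes");
            }
            offset += n;
        }
        return result;
    }
}
```

`DataInputStream.readFully` does the same job if you would rather not write the loop yourself.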

Good luck with your project!

Deep stuff, many thanks

I have also book-marked this ... once I get over the basics, I will dig and research more into this

At the moment my code crashes at the point where I supply the IP and port. Does anyone know what's wrong with the code below?

What am I doing wrong, and how do I decide which of the ports listed by netstat -a on the command line to use?


      private String serverIP = "192.68.1.2";
      ....
      ....
      send.setOnClickListener(new View.OnClickListener() {

            @Override
            public void onClick(View arg0) {
            
                Socket sock;
                try {
                    sock = new Socket(serverIP, 45210);   // crashed here
                    System.out.println("Connecting...");

                    // sendfile
                    File myFile = new File(selectedImagePath);
                    byte[] mybytearray = new byte[(int) myFile.length()];
                    FileInputStream fis = new FileInputStream(myFile);
                    BufferedInputStream bis = new BufferedInputStream(fis);
                    bis.read(mybytearray, 0, mybytearray.length);
                    OutputStream os = sock.getOutputStream();
                    System.out.println("Sending...");
                    os.write(mybytearray, 0, mybytearray.length);
                    os.flush();

                    sock.close();
                } catch (UnknownHostException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                } catch (IOException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
        });
    }

Crashed - is there an error message?

You shouldn't need to look at netstat to pick a port. Ports exist when someone tries to use them. Picking ports based on netstat is actually a bad idea, as those ports are already in use! However, using netstat might help you understand what is happening after you have successfully picked a port and started a server on it, or why a port isn't available to use - you might see the process that is already using that port.

That looks like a client. A client needs to "know" the port to connect to, in the same way it needs to know the IP address or DNS hostname.

For example, my web browser will connect to www.gamedev.net on port 443, as that is the default port that HTTPS uses. If we used HTTP, it would connect to port 80. Browsers allow a custom port to be specified like so: http://www.example.com:8080. Browsing to this would attempt to use port 8080.

For your custom game protocol, you would typically choose a custom server port. The exact number is not important, but ideally you'd avoid commonly used ports. This avoids confusion, reduces the chances that unexpected clients connect to the server, and reduces the probability that the port is already in use by another program when the server wants to start. Some light Googling indicates that port 45210 doesn't seem to be associated with any well-known software, so that should be a fine choice. Advanced server software often allows the server admin to choose an alternative port by changing a configuration file or passing a command line parameter.
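As an illustration of the command line option, a server written in Java might pick its port like this. It is a sketch with made-up names; the default of 45210 is simply the port from the code posted above.

```java
final class PortConfigSketch {
    static final int DEFAULT_PORT = 45210; // the port used in the posted client code

    /** Uses the first command line argument as the port, falling back to the default. */
    static int parsePort(String[] args) {
        if (args.length == 0) {
            return DEFAULT_PORT;
        }
        int port;
        try {
            port = Integer.parseInt(args[0]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Port must be a number: " + args[0], e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }

    public static void main(String[] args) {
        System.out.println("Server would listen on port " + parsePort(args));
    }
}
```

The client then needs the same number, whether hard-coded or entered by the user.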

Once the server is running on that port, the client needs to "know" where the server is. While you're learning, you can simply hard-code the IP address and port numbers to use. An intermediate step might be an input field that allows the server to be changed at runtime. A more advanced step later again might be to have a discovery service that allows clients to find servers.

Can't you use something like Kryonet instead of doing everything from zero? https://github.com/EsotericSoftware/kryonet

Also, you need to either use try-with-resources, if your target Android version supports it, or use finally blocks to release all the streams/sockets you're opening. Otherwise, if something goes wrong, you won't be closing anything, because the socket/stream .close() calls won't be reached.
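A sketch of what the try-with-resources version of the sending code could look like in plain Java. The class and method names are made up, and host, port, and file are passed in rather than taken from the fields in the posted code. It also streams the file in chunks instead of loading it all into one array, which is a change from the original.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

final class SendFileSketch {
    /** Connects, streams the file's bytes to the server, and closes everything. */
    static void sendFile(String host, int port, File file) throws IOException {
        try (Socket sock = new Socket(host, port);
             OutputStream os = sock.getOutputStream();
             InputStream fis = new BufferedInputStream(new FileInputStream(file))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = fis.read(chunk)) != -1) {
                os.write(chunk, 0, n);
            }
            os.flush();
        } // fis, os and sock are closed here in that order, even if an exception was thrown
    }
}
```

If the connection fails, the `IOException` propagates to the caller, so the stack trace still tells you what went wrong.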

And read the stack traces you're printing. That's what they're for.

"I AM ZE EMPRAH OPENGL 3.3 THE CORE, I DEMAND FROM THEE ZE SHADERZ AND MATRIXEZ"

My journals: dustArtemis ECS framework and Making a Terrain Generator

This topic is closed to new replies.
