One of the most pervasive misunderstandings surrounding async is that it makes things go faster. This is categorically false. If anything, async will actually make things slower. Why? Well, there's a lot of overhead required to run something asynchronously, and this overhead adds processing time.
So what does async do? Well, for that, first let's look at how synchronous processing works. I'll work with the example of a web application here because it's the most immediately understandable to most people. Obviously, a web application will need a web server. A web server really just does one thing: it listens for requests on a certain port, usually 80, or 443 for SSL. When it receives a request, it responds. How it responds depends on the type of the request and what other machinery is working under the hood. In the case of something like an ASP.NET MVC site, that machinery would be the MVC Framework. The web server passes the request to that Framework and waits for the Framework to tell it how to respond. Already, we've seen the word "wait". That's going to be important in the discussion to come.
A web server is typically multithreaded; at least, any production-capable web server will be. These threads determine the load the web server can handle. A single-threaded web server would be able to handle only one request at a time. In the period beginning when the web server receives the request and ending when it has a fully-formed response readied and sent, it cannot handle any further requests. They will be queued, meaning the client will have to wait in line until the web server has a free thread to handle the request. Because of this, the number of threads a web server has at its disposal is directly related to the load it can handle and to its response time: the amount of time it takes the server to respond to the client's request.
However, threads are not free. Each one uses resources, including some portion of RAM and CPU cycles. If I/O work is being done, it can also need write time on a hard drive, etc., which has its own queue-type system. However, that gets a lot hairier, so we'll avoid that particular discussion for now. The point is that there is a very real ceiling to the number of threads a web server can spawn. In a default installation, this will typically be set initially at 1000 in most web servers, and is often called the "max requests" of the web server, since it corresponds to the maximum number of requests the server can handle at any given time.
In a synchronous world, users 1-1000 do not have to wait in a queue. They are instantly handed off to whatever processing must be done to return their responses. Now, let's say that each of their requests requires some long-running process that prevents the server from responding instantaneously. Then, client 1001 is put into a queue, and must wait for one of the previous 1000 requests to complete. Client 1002 is likewise placed into the queue behind 1001, and must wait until a thread frees up for 1001, first, and then one for itself. Obviously, client 2000 is going to be waiting for a while, as 1000 requests must clear before its own request can be processed.
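To make the queueing concrete, here is a minimal sketch in Python rather than ASP.NET, so treat it as an analogy and not as how any particular web server is implemented. The pool size of 4, the 0.1 second delay, and the names `handle_request` and `run_sync_server` are all assumptions made up for the illustration. Each request blocks its worker thread for the whole delay, so the fifth client onward cannot start until an earlier request finishes.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 4      # stand-in for the server's thread ceiling (assumed value)
SERVICE_DELAY = 0.1  # seconds each request spends blocked (assumed value)


def handle_request(client_id: int, started_at: float) -> float:
    """Hold the worker thread for the whole request, as synchronous code does."""
    time.sleep(SERVICE_DELAY)  # the thread is occupied even though it only waits
    return time.monotonic() - started_at


def run_sync_server(num_clients: int) -> list[float]:
    """Return each client's time from arrival to response, in client order."""
    started_at = time.monotonic()
    with ThreadPoolExecutor(max_workers=MAX_THREADS) as pool:
        futures = [
            pool.submit(handle_request, client_id, started_at)
            for client_id in range(num_clients)
        ]
        return [future.result() for future in futures]


if __name__ == "__main__":
    for client, latency in enumerate(run_sync_server(8), start=1):
        print(f"client {client}: {latency:.2f}s")
```

With eight clients and four threads, the last four clients should take roughly twice as long as the first four, purely because they spent the first half of that time in the queue.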
Enter async. Asynchronous processing allows threads to be returned to the pool when they are not in use themselves but merely waiting on something else to finish. That part is important, because not every operation can be run async. It's all in that word, "waiting". If a thread is just sitting there, waiting on something, then it's a prime candidate for being made async. You'll often hear the term "long-running process" thrown about with async. I've done it myself above. However, it's often incorrectly applied as an excuse to run something asynchronously. Just because something takes a while doesn't mean it should be run async: the key factor is how much waiting is going on, not how long it takes overall.
The waiting is where async gives benefit, because using the previous example, if those 1000 clients call a web service and are just sitting there, waiting for the web service to respond before they themselves can respond, then clients 1001+ are being put into a queue for no reason. If they could only use one of those threads that's currently tapping its toes, then the server could clear out those requests a lot sooner.
That is what async does. It essentially says, hey, I'm just sitting here doing nothing while I wait for this thing to complete, so why don't you use me for some other task in the meantime. In other words, it hands the thread back to the pool, allowing the web server to handle another request on that thread. When the task it was originally waiting on completes, a thread is requested from the pool again (not necessarily the same one), and once the web server has a free thread, the response is completed as normal.
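As a rough illustration of that hand-off, here is a sketch using Python's `asyncio`. It is an analogy only: `asyncio` runs everything on a single-threaded event loop instead of a thread pool like ASP.NET's, and the names `call_web_service`, `serve`, and `run_async_server`, along with the 0.1 second delay, are assumptions invented for the example. The point it tries to show is that while a request is awaiting the pretend web service, the thread is free to start other requests.

```python
import asyncio
import threading

SERVICE_DELAY = 0.1  # seconds the pretend web service takes (assumed value)


async def call_web_service(stats: dict) -> None:
    """Simulate a request that spends its time waiting on a remote service."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    stats["threads"].add(threading.get_ident())
    await asyncio.sleep(SERVICE_DELAY)  # waiting: the thread is released here
    stats["in_flight"] -= 1


async def serve(num_clients: int) -> dict:
    """Start all requests and record how many were in progress at once."""
    stats = {"in_flight": 0, "peak": 0, "threads": set()}
    await asyncio.gather(*(call_web_service(stats) for _ in range(num_clients)))
    return stats


def run_async_server(num_clients: int) -> dict:
    return asyncio.run(serve(num_clients))


if __name__ == "__main__":
    result = run_async_server(8)
    print("requests in flight at peak:", result["peak"])
    print("threads used:", len(result["threads"]))
```

Eight requests are in progress at the same time even though only one thread ever runs them, because each one gives the thread up for the duration of its wait.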
There are a couple of key points to note in that. First, and again, I really want to emphasize the point, the thread must be in a wait state. If it's doing active work, it can't participate in async. This is why CPU-bound work must be synchronous: the thread is needed the entire time in order to complete the work.
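The same kind of sketch shows why CPU-bound work gains nothing. Again this is Python's `asyncio` standing in for the general idea, and `busy_work`, `cpu_bound_request`, and `serve_cpu_bound` are hypothetical names chosen for the illustration. The request is declared async, but because it never actually waits on anything, there is no moment at which the thread can be handed to someone else.

```python
import asyncio


def busy_work(n: int) -> int:
    """Pure computation: at no point is the thread merely waiting."""
    total = 0
    for i in range(n):
        total += i * i
    return total


async def cpu_bound_request(stats: dict) -> int:
    """An 'async' request that never awaits, so it never gives up the thread."""
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    result = busy_work(200_000)  # no await inside, so nothing else can run
    stats["in_flight"] -= 1
    return result


async def serve_cpu_bound(num_clients: int) -> dict:
    """Start all requests and record how many were in progress at once."""
    stats = {"in_flight": 0, "peak": 0}
    await asyncio.gather(*(cpu_bound_request(stats) for _ in range(num_clients)))
    return stats


if __name__ == "__main__":
    print("requests in flight at peak:", asyncio.run(serve_cpu_bound(8))["peak"])
```

Here only one request is ever in progress at a time: marking the work async added overhead without freeing the thread for anything else.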
Second, using async may actually add to the server's response time for some requests. Let's say we have a client with a request that requires talking to another web service. That means the thread that client's request is being processed on has some period of time where it will be waiting, doing nothing. As a result, async has been utilized to allow that thread to be given back to the web server. In the meantime, the server gets slammed and all available threads are tied up, but the web service has finally responded to that original request. The client must now wait until a thread becomes available before its request can continue processing, just as if it had only just gotten in line.
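A loose way to see this effect is the sketch below, again in Python's `asyncio` and again only an analogy: the single event-loop thread plays the role of a fully occupied thread pool. The names `waiting_request`, `hog_the_thread`, and `main`, and both timing constants, are assumptions made up for the example. The pretend web service answers after 0.05 seconds, but the request cannot resume until the only thread stops being busy.

```python
import asyncio
import time

SERVICE_DELAY = 0.05  # the web service answers after this long (assumed value)
BUSY_TIME = 0.3       # how long the only thread stays tied up (assumed value)


async def waiting_request() -> float:
    """Return how long this request took from start to resumption."""
    start = time.monotonic()
    await asyncio.sleep(SERVICE_DELAY)  # thread handed back while we wait
    return time.monotonic() - start     # measured once a thread picks us up again


async def hog_the_thread() -> None:
    """Stand-in for other work that keeps every available thread occupied."""
    time.sleep(BUSY_TIME)  # blocking call: the single thread cannot do anything else


async def main() -> float:
    latency, _ = await asyncio.gather(waiting_request(), hog_the_thread())
    return latency


if __name__ == "__main__":
    print(f"service answered after {SERVICE_DELAY}s, "
          f"request resumed after {asyncio.run(main()):.2f}s")
```

The request ends up taking about as long as the busy period, not as long as the service call, which is the extra wait described above.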
This is just one more reason why async does not equal "faster". It can and often will be slower. The benefit of async is that it allows the web server to use its available resources more efficiently, maximizing time that otherwise would be spent idling to do additional work instead.
One final myth I want to dispel is that async means no one has to wait. Although this has already partially been covered in the discussion above, it is still very possible to max out the server's available threads even if everything is running async. Further requests will then still go into a queue just as with synchronous processing. All async does is provide a little breathing room in most situations. It is not a magic bullet that instantaneously fixes an under-performing server.
To close, do use async when you have code that will be waiting on something, but only if it's truly waiting. Just because it takes a long time to do something doesn't mean all of that time is spent waiting with nothing else going on. Things like CPU-bound work are never eligible, while things like calls to web services or I/O work such as writing or reading files from a hard drive are prime candidates to be handled asynchronously. You need to evaluate its use on a case-by-case basis. Remember that async is not free. It adds overhead to your application and should therefore only be used when it will actually add benefit. It will always be a trade-off, though. You're sacrificing raw processing speed for flexibility: the ability to handle additional work during off times. And, as with any trade-off, you need to evaluate whether or not it's worth it in any given application.