Tuesday, May 15, 2012

Highly Available Geo Redundancy with Outbound Send Connectors in Exchange 2003 and Later

This is something I’ve been meaning to write down for a while. I wrote an answer to this question on LinkedIn about a week ago and I’ve just emailed an MCM Exchange consultant with this, so here we go…

If you configure a Send Connector (Exchange 2007 and 2010) or an Exchange 2003 SMTP Connector with multiple smarthosts for delivery, Exchange will round-robin across them all equally. This gives high availability, because if a smarthost is unavailable Exchange will pick the next one and mail will still get delivered, but it does not give redundancy across sites. If you add a smarthost in a remote site to the send connector, Exchange will use it in turn, equally with the local ones.

So how can you get geographical redundancy with outbound smarthosts? Quite easily it appears, and it all uses a feature of Exchange that’s been around for a while. But first these important points:

  • This works for smarthost delivery and not MX (i.e. DNS) delivery.
  • This is only useful for companies with multiple sites, internet connections in these sites and smarthosts in those sites.
  • This is typically done on your internet send connectors, the ones using the * address space.

You do this by creating a fake domain in DNS, let’s say smarthost.local, and then creating A records in this zone for each SMTP smarthost (e.g. mail.oxford.smarthost.local). Then create an MX record for your first site (oxford.smarthost.local MX 10 mail.oxford.smarthost.local), where oxford is the name of the first site in this example. Repeat for each site.

Then create a second, lower-priority MX record for each site, but point it at the A record of a smarthost in a different site (oxford.smarthost.local MX 20 mail.cambridge.smarthost.local).

Then add oxford.smarthost.local as the target smarthost in the send connector. Exchange will look up the address in DNS (MX first, A record second, IP address last), so it will find the MX records, resolve the A records for the highest-priority entries for the domain and then round-robin across these A records.

If you have more than one smarthost in a site, add more than one MX 10 record, one per smarthost. Exchange will round-robin across the 10’s. When all the 10’s are offline then Exchange will automatically route to mail.cambridge.smarthost.local (MX priority 20 for the oxford site) without needing to disable the connector and retry the queues.

If you used server names and not MX records then Exchange would round-robin amongst all entries, and so send email to Cambridge for delivery just as often as to Oxford. The MX option keeps mail in site for delivery until it cannot, and then sends it automatically to the failover site.
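The selection behaviour described above can be modelled in a few lines of Python. This is only a sketch of the logic as this post describes it, not Exchange's actual code; the host names mail1 and mail2 in the Oxford site are hypothetical, added to show two smarthosts sharing MX 10.

```python
# Hypothetical MX records for oxford.smarthost.local: (preference, smarthost A record).
MX_RECORDS = [
    (10, "mail1.oxford.smarthost.local"),
    (10, "mail2.oxford.smarthost.local"),
    (20, "mail.cambridge.smarthost.local"),
]


def candidate_hosts(mx_records, reachable):
    """Return the reachable hosts from the lowest-numbered MX preference
    that still has at least one reachable host. Exchange would then
    round-robin across the returned list."""
    for preference in sorted({pref for pref, _ in mx_records}):
        group = [
            host
            for pref, host in mx_records
            if pref == preference and host in reachable
        ]
        if group:
            return group
    return []  # nothing reachable at any preference: mail stays queued
```

With both Oxford hosts reachable the function returns only the two MX 10 hosts; with both of them offline it returns the Cambridge host, which is the automatic failover described above.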

Thursday, May 10, 2012

Starting Exchange When You Have Active Directory Issues

I had a call the other day from a company who had Exchange issues. On investigation it turned out they had a very suspect Active Directory and no-one would admit to what they had actually done to get it in such a state!

One server (DC1) would not talk to the other DCs (Kerberos issues and replication issues) and the other DCs were missing the Microsoft Exchange Security Groups OU and the groups it contains, as well as other Exchange related stuff, though the schema and configuration were present!

DC1’s event logs were full of errors going back about six days (to when the issue started, though I only got a call a day before we had it fixed). But if I looked back in the log more than six days, the event log showed only stuff from almost a year ago. I suspect a snapshot of the server was restored, but as I said, the only thing anyone claimed to have done was attempt to restore a user from a backup!

So the first step was to see if we could isolate DC1 from Exchange and do a setup /PrepareAD to replace the missing items in the domain naming context.

This requires limiting Exchange to DC2 with the Set-ExchangeServer Exchange Management Shell cmdlet, but the shell would not start due to AD errors, so out with the registry editor.

To hard code Exchange to selected DCs you need to visit HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MSExchange ADAccess and create a new key called Instance0. Inside \Instance0 create a String called ConfigDCHostName that has a value of the FQDN of the DC to use.

Then create a Profiles key under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MSExchange ADAccess\, which is the same location as before. Under Profiles create a subkey called Default. For Exchange 2010 create a DWORD called MinUserDC with a value of 1, and under the Default key create two more keys called UserDC1 and UserGC1. MinUserDC is in a different location for Exchange 2007.

Inside the UserDC1 key add a string called HostName (the value being the FQDN of the domain controller to use) and a DWORD called IsGC with a value of 0.

Inside the UserGC1 key add a string called HostName (the value being the FQDN of the global catalog server to use) and a DWORD called IsGC with a value of 1.

An example is shown in the picture for clarity:

image
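If you would rather import the values than type them, the keys above can be written out as a .reg file. The Python below is a minimal sketch that only builds the text of such a file; the server name dc2.example.local is hypothetical, and placing MinUserDC directly under Profiles\Default is my reading of the steps above for Exchange 2010, so check it against your own servers before importing anything.

```python
BASE = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MSExchange ADAccess"


def build_reg_file(config_dc, user_dc, user_gc):
    """Return .reg file text that pins Exchange to the named servers.

    Assumption: MinUserDC sits under Profiles\\Default (Exchange 2010 layout
    as described in this post)."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{BASE}\\Instance0]",
        f'"ConfigDCHostName"="{config_dc}"',
        "",
        f"[{BASE}\\Profiles\\Default]",
        '"MinUserDC"=dword:00000001',
        "",
        f"[{BASE}\\Profiles\\Default\\UserDC1]",
        f'"HostName"="{user_dc}"',
        '"IsGC"=dword:00000000',
        "",
        f"[{BASE}\\Profiles\\Default\\UserGC1]",
        f'"HostName"="{user_gc}"',
        '"IsGC"=dword:00000001',
        "",
    ]
    return "\n".join(lines)


# Hypothetical example: DC2 is both the domain controller and the global catalog.
reg_text = build_reg_file("dc2.example.local", "dc2.example.local", "dc2.example.local")
```

Save reg_text to a file with a .reg extension and review it before importing it on the Exchange server.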

Restart the Microsoft Exchange ADTopology Service to see if it can now connect to the correct server (the MinUserDC value stops Exchange attempting to connect to the PDC emulator as well as the listed domain controllers). In my client’s case, the PDC emulator was DC1, which was effectively unreachable.

If you can get Exchange online now, great! Time to fix the issues with DC1. But if you can’t (and in my example I could not) then it is time for more troubleshooting. It’s sort of just like the MCM Qual Lab, just with real customer data!

To cut a long story short, in my example I decided that DC1 was the more accurate DC and that an authoritative restore of it to the last available AD backup (one month old!) might fix up the issues that had crept in since the abortive work done by the client earlier in the week. In this client’s case, I used ntdsutil on DC2 to remove DC1 and then used dcpromo to demote all the DCs so that they returned to member servers and standalone machines. Then I used ntdsutil to remove DC2 etc. from the copy of AD on DC1 so that I was left with an almost up to date copy of AD on DC1. Then I rejoined DC2 etc. to the DC1 replica so I was back where the client thought they were, with a number of DCs but all replicating and Exchange objects all present. I needed to rejoin the servers to the domain, but once that was done I had a working Exchange environment. It was only six and a half days since the outage, and the client’s email cloud filtering company held email for seven days, so no loss of email! Just about!

All in a day’s work for a Microsoft Certified Master | Exchange Server 2010.

Monday, May 07, 2012

Domain Redirection and BPOS to Office 365 Migration

With Microsoft Exchange Online via BPOS (the precursor to Office 365) you were able to configure a simple CNAME redirection to make access to OWA easier for your users.

For example, you could create a CNAME in DNS for mail.fabrikam.com (where fabrikam.com is your domain in BPOS) which pointed to go.domains.live.com and then when users accessed mail.fabrikam.com they would be redirected to the correct login page for BPOS. Note that this is also possible by setting the CNAME target to outlook.com.

A problem here though is that if you use go.domains.live.com then once you complete migration from BPOS to Office 365 the redirection stops working and when your users visit mail.fabrikam.com they get go.domains.live.com and not OWA!

So before you start your migration make sure you have changed your CNAME to outlook.com rather than go.domains.live.com and when users visit http://mail.fabrikam.com they get OWA both before and after the migration.

This recommendation also applies to academic organisations moving from Live@Edu to Office 365 for Education.

Thursday, April 19, 2012

Restricting Message Sizes in Exchange Server to Low Bandwidth Sites

Exchange Server has a series of different settings for controlling the maximum message size into and around an Exchange organization. But what about when parts of your organization have considerably lower bandwidth than other parts, for example offices in rural or hard to reach locations that rely on satellite WAN links, or ships at sea?

For these and other examples it has been possible since Exchange Server 2007 SP1 to limit the size of messages sent to and from these limited bandwidth sites by setting the MaxMessageSize property with Set-AdSiteLink:

Set-AdSiteLink TitanicSiteLink -MaxMessageSize 2MB



Once an email is sent to a recipient in the target site, Exchange Server (as part of the Categorizer component) determines the least cost route and sends the email. If the least cost route includes the site link on which you have set the limit, the email will be returned to the sender as an NDR if it exceeds the MaxMessageSize limit. If you only have one AD Site Link to your limited bandwidth site then Exchange routing will have to use that link. If you have more than one AD Site Link, make sure they are all set to the limited size so that whatever the calculated least cost route is, the size limit will be enforced.
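To see why every link to the site needs the limit, here is a deliberately simplified Python model of the rule above. It assumes a single hop where each link reaches the low bandwidth site directly; BackupSiteLink and the cost values are hypothetical, and the real least cost calculation in Exchange is more involved than a simple minimum.

```python
MB = 1024 * 1024

# Hypothetical links to the same low bandwidth site.
# A max_size of None means MaxMessageSize has not been set on that link.
SITE_LINKS = {
    "TitanicSiteLink": {"cost": 10, "max_size": 2 * MB},
    "BackupSiteLink": {"cost": 5, "max_size": None},
}


def delivery_outcome(message_bytes, site_links):
    """Pick the least cost link, then apply that link's size limit (if any)."""
    name, link = min(site_links.items(), key=lambda item: item[1]["cost"])
    limit = link["max_size"]
    if limit is not None and message_bytes > limit:
        return f"NDR (over limit on {name})"
    return f"delivered via {name}"
```

In this model a 5 MB message is delivered over BackupSiteLink because that link is cheaper and has no limit, even though TitanicSiteLink is capped at 2 MB. Setting the same limit on both links closes that gap.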



The only problem with this is that Exchange does not have the correct permissions within the Active Directory to be able to configure this setting. Therefore if you try the above Exchange Management Shell cmdlet it will fail with the following error:




Active Directory operation failed on dc-name. This error is not retriable. Additional information: Insufficient access rights to perform the operation.
Active directory response: 00002098: SecErr: DSID-03150BB9, problem 4003 (INSUFF_ACCESS_RIGHTS), data 0
    + CategoryInfo          : NotSpecified: (0:Int32) [Set-AdSiteLink], ADOperationException
    + FullyQualifiedErrorId : ADC691A4,Microsoft.Exchange.Management.SystemConfigurationTasks.SetAdSiteLink




The issue comes down to the fact that the Exchange Trusted Subsystem user account does not have permissions to the delivContLength attribute on the AD site link that you are trying to change. Therefore to make this setting in Exchange you need first to set the correct permissions in AD.



To set the correct permissions open Active Directory Sites and Services (if running Windows 2008 R2 or later) or ADSIEdit if using an earlier version of Windows. Expand Sites and Services to find Sites > Inter-Site Transports and right-click the IP container and choose Properties and change to the Security tab:



image



In ADSIEdit connect to the Configuration well known Naming Context and expand to CN=Configuration… > CN=Sites > CN=Inter-Site Transports and right-click CN=IP. Again select Properties and change to the Security tab:



image



Once in the Security tab click Advanced, click Add and type Exchange Trusted Subsystem. In the Permission Entry for IP dialog that appears once you click OK select the Properties tab and then select Descendant Site Link Objects in the Apply To box:



image



In this dialog find the Write delivContLength permission and click Allow.



Click OK enough times to close all the dialog boxes and windows and you have now granted Exchange the permission to set the MaxMessageSize property on any (and all future) AD site links that you have or may create.

Wednesday, February 22, 2012

Hosting Exchange 2010 and Issues With Duplicate Contacts

When you are creating a hosted Exchange system using the Exchange 2010 On Premises product (not the /hosting version of the product) it is likely that if two or more of your customers create a mail contact in the global address list (GAL) for the same external email recipient they will see some issues with email addressing.

For example, you are hosting Exchange for northwind.com and fineartschool.net within one Exchange organization. Both these companies have a professional relationship with greg@fabrikam.com and so want to create a contact for him in the GAL. The first of your clients to create the contact will be successful, but any future client receives the following error when they attempt to create the contact:

New-MailContact -Name "Greg (Fabrikam)" -ExternalEmailAddress greg@fabrikam.com -OrganizationalUnit FineArtSchool
The proxy address "SMTP:greg@fabrikam.com" is already being used by "isp.corp/Hosted/Northwind/Greg (Fabrikam)". Please choose another proxy address.
    + CategoryInfo          : NotSpecified: (…) :ADObjectId) [New-MailContact], ProxyAddressExistsException
    + FullyQualifiedErrorId : B333D21C,Microsoft.Exchange.Management.RecipientTasks.NewMailContact

The workaround is to specify a unique proxy address, as the default proxy address (the contact’s actual email address) is already being used:

New-MailContact -Name "Greg (Fabrikam)" -ExternalEmailAddress greg@fabrikam.com -OrganizationalUnit FineArtSchool -PrimarySmtpAddress greg@fineartschool.net

Of course Greg’s email address is greg@fabrikam.com (his external email address) and not greg@fineartschool.net (his proxy or primary SMTP address so far as Fine Art School have configured) and if this client sends an email to Greg and they select Greg from the GAL it will go to his external email address but will look like it has gone to his proxy address. That is, Greg will receive the email but if he looks at the address it was sent to it will say greg@fineartschool.net.

Send an email to two people in external organizations, one being greg@fabrikam.com, and when the other recipient hits Reply All, Greg will appear as greg@proxyaddress and not greg@fabrikam.com. Emails in reply will go to Greg via the hosting company and not direct to Greg. This also has the side effect of showing presence (from Microsoft Lync) as being unavailable, as the email is using the wrong email address.

The underlying problem is that though the email is being delivered to the external address (targetAddress attribute in Active Directory) it is being stamped with the primary SMTP address (proxyAddresses in Active Directory) in the P2 header. The P2 header is used to generate the Reply address.

So how do you fix this? The obvious way at first glance is to modify Active Directory and change the proxyAddresses value back to the correct value, but this does not work (as two objects cannot have the same proxy address). Even though the two mail contacts would then have the same targetAddress and proxyAddresses, Exchange Transport detects a problem and reports the error “More than one Active Directory object is configured with the recipient address greg@fabrikam.com. Messages to this recipient will be deferred until the configuration is corrected in Active Directory” in the event log on the first Hub Transport server that sees the message.

So without writing your own transport agent, you need to route all outbound email via an Edge Transport server and configure the Address Rewriting agent. You need to create an address rewrite rule for every duplicate contact, that is, from the second contact onwards for any given external address in your hosted organization. So in your mail contact provisioning application you need to trap the duplicate proxy address error above, reissue the mail contact creation step, this time with a unique primary SMTP address in the hosted client’s domain, and at the same time make an address rewrite rule on your Edge Transport server.

New-AddressRewriteEntry -Name "Greg - Fabrikam - HosterFineArtSchool" -InternalAddress greg@fineartschool.net -ExternalAddress greg@fabrikam.com -OutboundOnly $true

Note that rewrite rules are cached for four hours, so unless you restart the MSExchangeTransport service your rewrite rules will not take effect until four hours have gone by.
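As an illustration of what the provisioning application has to produce once it has trapped the duplicate proxy address error, the Python below builds the two commands shown in this post as strings. It is a sketch only: the function name and the way the proxy address is derived (the same local part in the client's domain) are my assumptions, and a real application would also need to cope with that derived address already being in use.

```python
def fallback_commands(name, external_address, org_unit, client_domain, rule_name):
    """Build the retry New-MailContact command and the matching
    New-AddressRewriteEntry command for a duplicate external contact."""
    local_part = external_address.split("@", 1)[0]
    # Assumption: reuse the local part in the hosted client's own domain.
    proxy_address = f"{local_part}@{client_domain}"
    create_contact = (
        f'New-MailContact -Name "{name}" -ExternalEmailAddress {external_address} '
        f"-OrganizationalUnit {org_unit} -PrimarySmtpAddress {proxy_address}"
    )
    rewrite_rule = (
        f'New-AddressRewriteEntry -Name "{rule_name}" -InternalAddress {proxy_address} '
        f"-ExternalAddress {external_address} -OutboundOnly $true"
    )
    return create_contact, rewrite_rule
```

Called with the values from the Fine Art School example, this returns exactly the New-MailContact and New-AddressRewriteEntry lines shown above. The first runs in the hosting organization and the second on the Edge Transport server.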

Monday, February 20, 2012

Running Offline Web Applications from IIS Server

A feature of HTML 5 based applications is the ability to ensure that applications can still run even if internet connectivity is not present. How to do this is covered on the W3.org website.

A requirement of offline access is the creation of the offline cache manifest file. This manifest file is listed in the HTML tag on the page as such:

<html manifest="offline.appcache">



A file is then saved to the web server with the same name (offline.appcache in this example). This .appcache file follows the conventions described on the above W3.org web page, but it needs to be served from the web server with a specific MIME type (text/cache-manifest). If the web server is IIS 5.0 or later then it will only serve content that has been listed as a valid MIME type in Windows. If you use a shared hosted web server then changing the server-wide MIME types is probably impossible, but from IIS 7.0 onwards you can add your own MIME type in the admin UI or modify the web.config file in the root of your website to add it. The web.config file is just a text file that you upload and so requires no access to the IIS admin application (again, typically something you do not get with a shared hosted web server).



Note: In the example given below, the web.config file changes two properties. If you have an existing web.config file then merge these changes into your file and do not replace your file.



The web.config file needs to be as follows:



<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <staticContent>
            <mimeMap fileExtension=".appcache" mimeType="text/cache-manifest" />
        </staticContent>
    </system.webServer>
    <location path="offline.appcache">
        <system.webServer>
            <staticContent>
                <clientCache cacheControlMode="DisableCache" />
            </staticContent>
        </system.webServer>
    </location>
</configuration>



The two changes set in this web.config file are, firstly, mimeMap in the staticContent section of system.webServer. This adds the .appcache extension as text/cache-manifest. The second change is clientCache in staticContent section of system.webServer (but this time in a location section, limiting the effect of the setting to the named file – offline.appcache). This change stops the web server or client from caching the page, ensuring that the web server always serves the latest copy of the page.



Upload web.config and your appcache manifest file, along with any page that needs to be viewed offline (or indeed any page that you want to load faster by having it cached on the client), and check that when you browse to the .appcache file directly in an HTML 5 aware browser it is visible. If you get a 404 error on this file then you have not set the MIME type or uploaded the correct web.config file.
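The same check can be scripted rather than done in a browser. The Python below is a rough sketch using only the standard library; the function names are my own, and the URL you pass in would be wherever your manifest is published.

```python
import urllib.error
import urllib.request


def manifest_problems(status, content_type):
    """Return a list of problems with how the manifest was served (empty if none)."""
    problems = []
    if status != 200:
        problems.append(f"unexpected HTTP status {status}")
    # Ignore any parameters such as "; charset=utf-8" after the MIME type.
    mime_type = content_type.split(";")[0].strip().lower()
    if mime_type != "text/cache-manifest":
        problems.append(f"unexpected MIME type {mime_type!r}")
    return problems


def check_manifest(url):
    """Fetch the manifest over HTTP and report any problems found."""
    try:
        with urllib.request.urlopen(url) as response:
            return manifest_problems(
                response.status, response.headers.get("Content-Type", "")
            )
    except urllib.error.HTTPError as error:
        # urlopen raises for 4xx/5xx responses; the status and headers are still available.
        return manifest_problems(error.code, error.headers.get("Content-Type", ""))
```

Calling check_manifest with the full address of your offline.appcache file should return an empty list once web.config is in place. A 404 in the list points back at the missing MIME type.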

Tuesday, January 24, 2012

HTTPS Load Balancer Issues with Exchange 2010 SP2

When you install Service Pack 2 (and maybe SP1 too) on Exchange 2010 it resets the SSL flag on the root directory of the IIS website, so that the root requires SSL again. You might have removed this setting for a number of reasons, mainly to do with having an HTTP to HTTPS redirect, but it can also be removed if you are doing SSL Offloading to a load balancer and that load balancer checks the state of the client access server by doing HTTP requests for the root home page. The Citrix Netscaler is one such load balancer that has this as a default setting.

The configuration documentation for the Citrix Netscaler (found here) does not discuss changing the load balancer to use a different directory on IIS to monitor the availability of the site, so when you install SP2 for Exchange 2010 and that update resets the root directory to require SSL, your load balancer thinks the site is offline and does not pass through any traffic!

image

image

To fix this issue in the short term, just uncheck the Require SSL option on the root of the Default Web Site on each of your Client Access Servers. Your load balancer should notice within a few seconds and service will resume. For example, the Citrix Netscaler checks the root directory via the monitor properties every five seconds for an HTTP success code (and not an HTTPS success code!).

To fix this issue in the long term you should make a new virtual directory on each server covered by the load balancer and get the load balancer to look at this directory to determine if the service is up or down rather than looking at the root directory. Your virtual directory will not be reconfigured by future Exchange service packs (or indeed any other application that you are load balancing that might reset the SSL option on the root directory).

To complete these steps do the following:

1. Create a folder in the inetpub directory called “monitor” or similar (in the examples below the folder is called “netscaler_monitor”).

2. Place an index.htm file in this folder containing a very simple webpage. If you want to make the page more complex and include code (so that issues with the code are picked up by the load balancer) then this is fine. A simple page would look like the following:

<html><head>
<title>Netscaler Monitor for Exchange 2010</title>
</head><body>
<p>This page returns a success code to the netscalers if IIS is running. This page must always work over HTTP and never require an SSL connection.</p>
</body></html>

3. In IIS require SSL and then uncheck require SSL – this forces a setting into the IIS config file (applicationHost.config) that says that this folder must always be over HTTP and not require SSL. If you do not do this then this folder will take the setting from the parent folder, and as we have already seen, this will cause the monitor folder to require SSL when you apply the service pack.

This SSL change will result in the following configuration at the bottom of applicationHost.config, which can be added directly to the config file rather than in IIS Manager.


    <location path="Default Web Site/netscaler_monitor">
        <system.webServer>
            <security>
                <access sslFlags="None" />
            </security>
        </system.webServer>
    </location>
</configuration>

4. Update your load balancer so that it has a new monitor for checking the service state on the managed machine. This monitor would be something like the following for a Citrix Netscaler, each load balancer being different. This monitor checks HEAD /netscaler_monitor/ and expects to get back a 200 status code. You need to change the folder name to match, but ensure the / is before and after the folder name.

image

5. Change the configuration for each client access server in the load balancer so that it uses the new monitor rather than the default HTTP monitor.

image

6. Save your changes to the load balancer. The next time you service pack Exchange 2010 the resetting of the SSL flag on the root directory will not cause you any issues.
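Before pointing the load balancer at the new folder, it can be worth sending the same request yourself from another machine. The Python below is a small sketch of what the monitor described in step 4 does (a HEAD request over plain HTTP that expects a 200 status). It is not Netscaler's own implementation, and the host name you pass in is whatever your client access server is called.

```python
import http.client


def monitor_is_up(host, port=80, path="/netscaler_monitor/", timeout=5):
    """Send HEAD <path> over plain HTTP and report whether the status is 200."""
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("HEAD", path)
        return connection.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure or a malformed response.
        return False
    finally:
        connection.close()
```

If this returns False for a server that is running, check that the folder name matches and that Require SSL is unchecked on the monitor folder. IIS normally answers a plain HTTP request to an SSL-only folder with a 403 rather than a 200.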