What's the safest way to get this data to the outside?
August 7, 2015 2:00 PM
I have a database table with sensitive (read as personal, non-financial) information inside our office locked-down network that I need to expose to users on a website. What's the best way for me to expose this data without exposing us to unnecessary risk?
I have a 200MB-ish table of 200k-ish rows of data associated with users, that we want to expose to users on a website so that users can edit their data without us having to be involved - although we will still be involved in updating as well, so it'll be a two-way sync. The problem is that the data in question is currently kept within a database server that's within our office network, which is locked down (and appropriately so).
This is likely the start of us doing this with more data and databases that are currently internal-only. Nothing is likely to be in the tens of gigabytes, but I wouldn't be surprised to see a gigabyte-sized table get involved in the future.
All solutions seem to be revolving around Azure at this point, as the company wants to start expanding to use Azure in other things as well (outside of me, who spends his days in Linux, everyone else here is MS born and bred). The database in question is a SQL Server 2008 R2 instance.
My idea was to use a virtual network to extend our internal network to include our Azure account, and then write a web API hosted within Azure to access the database so that we can lock down precisely what data we can grab from it (locked down SQL user that can only access a particular view, and all interactions happening with stored procedures). Then open up just port 443 on this Azure web API instance/virtual network, encrypt the transfer, and lock down access to that point to only come from a non-public IP from the website in question, which will be the only IP that will need to access this data.
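A minimal sketch of what that lockdown could look like at the API layer, assuming Python for illustration. The procedure names, parameter lists, and the website's private address are hypothetical placeholders, not details of the actual setup. The point is that the API only ever forwards an allowlisted stored-procedure name plus an expected set of parameters, and only for a caller on the allowed address.

```python
import ipaddress

# Hypothetical names: these procedures and this address are placeholders.
ALLOWED_PROCEDURES = {
    "usp_GetUserProfile": ("user_id",),
    "usp_UpdateUserProfile": ("user_id", "email", "phone"),
}
ALLOWED_CALLERS = [ipaddress.ip_network("10.0.1.4/32")]


class Rejected(Exception):
    """Raised when a request falls outside the allowlist."""


def check_request(caller_ip: str, procedure: str, params: dict) -> tuple:
    """Validate a request and return (procedure, ordered argument tuple).

    The result would be handed to a parameterized stored-procedure call
    made with the locked-down SQL login; no SQL text is built here.
    """
    address = ipaddress.ip_address(caller_ip)
    if not any(address in network for network in ALLOWED_CALLERS):
        raise Rejected(f"caller {caller_ip} is not allowed")
    if procedure not in ALLOWED_PROCEDURES:
        raise Rejected(f"procedure {procedure!r} is not exposed")
    expected = ALLOWED_PROCEDURES[procedure]
    if set(params) != set(expected):
        raise Rejected("unexpected or missing parameters")
    # Order the arguments the way the procedure expects them.
    return procedure, tuple(params[name] for name in expected)
```

Anything not in the table is refused before it gets near the database, which mirrors the "one SQL user, one view, stored procedures only" restriction on the database side.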
There's some pushback on this idea though - I think because of opening 443 on the Azure virtual network. Our senior dev wants to use SQL Sync Agent to synchronize the table up to Azure over an encrypted connection, which then would be in the cloud and exposed, but since the data is encrypted up to the cloud (and synced), we can query the database directly from the website over a secure connection. With the SQL Sync Agent, no additional ports would need to be opened, but the database table would sit out on the cloud.
My background is in web, so I tend to just want to make APIs out of everything. I feel like I'm right in this situation (or at the very least both of our options are valid), but our senior dev is a smart guy, and his resistance to my method means I end up feeling like I'm missing something obvious on both sides of this. Am I?
Best answer: so are the different options:
- push the data to the cloud (your senior dev)
- pull the data to the cloud (you)
- place all data in the cloud (primethyme)
if so, pushing is the solution that i have seen used elsewhere. it keeps the internal network as safe as possible.
posted by andrewcooke at 2:58 PM on August 7, 2015
To clarify, my answer was intended to include "push the data to the cloud" as an option. At least that's what I meant by including the option of replicating or moving. However, if the purpose is to let your customers/users update their own data, and then have those updates be used internally, you can't simply push upward to the cloud. You also need to pull the changes back down. And deal with any discrepancies, delays, etc. It adds more complexity and fragility. A single source of truth makes these things a lot easier and saves work. And if the data is already in the cloud anyway, it doesn't buy you a lot to also keep a copy internally (unless you only need a subset in the cloud).
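To make the "discrepancies" point concrete, here is a minimal sketch in Python of the decision a two-way sync has to make for every row. It assumes each side keeps the last synced copy of the row as a base for comparison; the `Row` type and `merge` function are hypothetical illustrations, not part of any sync product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    key: int
    value: str


def merge(base: Row, internal: Row, cloud: Row) -> Optional[Row]:
    """Three-way merge of one row against the last synced copy (base).

    Returns the row both sides should end up with, or None when the
    two sides changed it differently and someone has to decide.
    """
    internal_changed = internal != base
    cloud_changed = cloud != base
    if internal_changed and cloud_changed:
        # Both edited: only safe if they happened to make the same edit.
        return internal if internal == cloud else None
    if internal_changed:
        return internal
    if cloud_changed:
        return cloud
    return base
```

The `None` case is the extra work: with a single source of truth it never comes up, while with two copies every synced table needs a rule for it.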
posted by primethyme at 3:13 PM on August 7, 2015 [2 favorites]
retaliatory clarification: i didn't include the other direction just to simplify the descriptions, but the argument still holds (you generally push from and pull towards the safe network).
i do agree that the trade-off is synchronisation issues.
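a rough sketch of "push from and pull towards the safe network", in Python with an in-memory stand-in for the cloud side. `CloudStore` and `sync_once` are hypothetical names for illustration only. what matters is that both steps are calls made by the internal side, so nothing outside has to connect in.

```python
class CloudStore:
    """Stand-in for the cloud-side database; hypothetical, in memory only."""

    def __init__(self):
        self.rows = {}
        self.pending_edits = {}  # edits made by website users since last sync

    def user_edit(self, key, value) -> None:
        self.rows[key] = value
        self.pending_edits[key] = value

    def upload(self, rows: dict) -> None:
        self.rows.update(rows)

    def download_edits(self) -> dict:
        edits, self.pending_edits = self.pending_edits, {}
        return edits


def sync_once(internal_rows: dict, changed_keys: set, cloud: CloudStore) -> None:
    """Run one sync cycle, initiated from inside the office network."""
    # Pull user edits down first. This naively lets the cloud edit win
    # when both sides touched the same key.
    internal_rows.update(cloud.download_edits())
    # Then push the internally changed rows up.
    cloud.upload({key: internal_rows[key] for key in changed_keys})
```

the "cloud wins" shortcut in the pull step is exactly where the synchronisation trade-off shows up.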
posted by andrewcooke at 3:22 PM on August 7, 2015
I am a software engineer, but I am not your software engineer.
I would have a separate domain (named, e.g., External) with a trust relationship to your internal domain. It doesn't matter whether it's Azure or not. Your external endpoint should store as little data as possible. Construct an API on the external domain to shuttle data back and forth between your users and your current DB. Try to eliminate any databases on your external endpoint.
posted by boo_radley at 4:41 PM on August 7, 2015
Response by poster: Thanks everyone for the food for thought. Chalking this one up under the "missing something obvious" list.
posted by chillin411 at 8:44 AM on August 10, 2015
I would not poke a hole into your office network for this, even if it's really locked down. That's not a great architecture from a performance, reliability, or security standpoint. I would either move or replicate the data that is needed into a database instance inside of Azure. Yes, having the database "in the cloud" adds some risk, but it removes other risk, and if you do it correctly it should be adequately secure. I'd personally be more worried about viruses coming in on employee laptops and getting at your internal network than people getting at a properly-secured database in Azure.
Keep in mind "the cloud" is how the vast majority of new technology companies are building their stuff, and the direction many older companies are moving. It's not inherently insecure. You do need people who know what they're doing to configure the systems and write the code. But you need that either way, if you're going to be opening up an application to the world.
posted by primethyme at 2:21 PM on August 7, 2015 [2 favorites]