Rocket Genius is the company behind the wildly popular Gravity Forms form builder platform for WordPress. Initially they moved their main marketing site to Pagely to “keep it in the WordPress family” but quickly realized that Pagely’s support was a huge asset. When they encountered scaling issues with their mission-critical licensing server, still running at their former host, it was a no-brainer to bring it over to Pagely to have us sleuth out the performance culprits.

Fast forward five years and Rocket Genius remains a happy Pagely customer, having successfully navigated various scaling challenges and grown its user base considerably. In this interview we talk through eight specific challenges that we helped resolve in our relentless pursuit of improving their performance while optimizing their costs. Enjoy this conversation with Alex Cancado of Rocket Genius. If you have a question for Alex, you can ask it below via a comment. And if you’re wrestling with a similar scaling challenge of your own, get in touch with us and we’d be happy to help.

Show Notes

Time – Topic
0:01:12 – Welcome and context
0:02:20 – What is Gravity Forms?
0:04:22 – Backstory on Rocket Genius becoming a Pagely customer
0:06:15 – What was the issue you were experiencing with your previous host that prompted you to seek out Pagely?
0:07:37 – Challenge #1: Needing SSH to connect desktop and server-based tools to the sites
0:10:46 – Not just hosting Gravity Forms’ marketing site but solving scaling issues for the plugin itself
0:12:39 – Challenge #2: Challenges at FireHost with the Gravity Forms licensing server
0:15:14 – Migrating and load testing the GF licensing server for capacity planning to ensure a smooth transition
0:19:19 – Switching from CPU-optimized RDS to memory-optimized for max performance/cost ratio given the circumstances
0:21:41 – Amazon’s Performance Insights tool for getting granular New Relic-like insight into MySQL and MariaDB databases
0:24:24 – How did that cutover process of bringing the licensing server live on Pagely go?
0:26:25 – Challenge #3: Locking down XML-RPC for security purposes without breaking Jetpack
0:30:17 – Challenge #4: 503 memory errors and using the Pagely monitoring endpoint
0:34:38 – Challenge #5: Brute force password hacking and Pagely’s paper trail tool
0:37:07 – Challenge #6: SSL configuration issue with external service Cloudflare
0:39:58 – Challenge #7: Running Discourse for forums as a subdirectory of the WP instance for SEO purposes
0:46:47 – Challenge #8: Outages due to unoptimized query calls to a Gravity Manager plugin
0:52:12 – Wrap-up

Links mentioned in the interview

GravityForms.com – the Gravity Forms marketing web site.
GravityHelp.com – their licensing server.
wrk load testing app – tool Pagely used for emulating live traffic
k6 – load testing tool from Load Impact that we now use
AWS Performance Insights – inspection tool for getting granular performance insight
Jetpack’s VaultPress – service for on-demand backups of your WP site
OpenResty gateway – the foundation that Pagely’s ARES gateway is built upon
NGINX – web server that Pagely prefers to use
Help Scout – support system that Rocket Genius uses

Transcript

Alex Cancado: 00:00 One of the coolest things about working with you guys is that a lot of times we’re making changes and we make a mistake, and you guys see it before we see it. Even though we were right there making the change, like, okay, we made the change, deployed something, and then Pagely is like, hey guys, there’s something going on here.

Arman Zakaryan: 00:16 Throughout our load testing phase, we were able to get the right size so that they have enough power to handle what their site demands are, for the licensing and for support, but not be overspending.

Alex Cancado: 00:29 I mean, this type of problem would be a big problem if we were talking about a host that is not willing to go in there and dig and find the specific reason why you’re having an issue.

Arman Zakaryan: 00:38 It’s a perfect opportunity for them to say, oh no, you need more hardware. You’ve got to double or triple your spend on your hardware, right? It’s a chance for them to upsell things, right? We operate a little differently. We want to solve what the root problem is, because in the long run it’s actually just easier to support you.

Alex Cancado: 00:52 So really, I have nothing but very good things to say, and it’s mostly because of support.

Sean Tierney: 01:06 Okay. We’ll get started. Today we are joined by Alex from Rocket Genius. Alex, welcome to the podcast, webinar, or whatever we’re calling this, the video case study.

Alex Cancado: 01:18 Hi, very glad to be here, Sean.

Sean Tierney: 01:20 Great. So this is the third one of these we’ve done. I’ll just tee up the conversation for the people that are tuning in for the first time. What this is, is a video case study. We’re going to talk through some challenges that you guys have had with us and how we addressed those challenges. And the goal here is really just to surface anything that might be useful for the folks listening, to see how we dealt with various challenges and how we operate, and just to get a feel for what this is all about. Hopefully they’ll learn a little bit about your business in the meantime. And Gravity Forms, I’ll let you introduce yourself, but I can tell you I’ve been a longtime user of Gravity Forms. You guys have built something that’s extremely cool because it’s like a platform on WordPress. It’s arguably the best form builder that’s out there, but it’s also more than that. People have built survey tools and all kinds of integrations. So it’s a really cool product. We’re happy to host you guys and excited to have you on the show. So how do you describe Gravity Forms?

Alex Cancado: 02:21 So yeah, Gravity Forms, in a nutshell, is a form builder tool. We designed it thinking it would be used for people to develop contact forms, and then this thing kind of transformed, like you said, into a platform where you see all kinds of different types of forms that people are developing. So basically anything that requires user interaction on a website, people are using Gravity Forms for. We have a lot of what we call add-ons, which are extensions to the basic product that either add functionality or integrate Gravity Forms with third parties, you know, like MailChimp, Stripe, PayPal and things like that. And on top of that, we have a very nice API with hooks and classes and filters that people can use to extend Gravity Forms to do pretty much whatever they want.

Sean Tierney: 03:18 We use it at Pagely pretty extensively. I set up a tool for our sales team where on every call we’re taking notes, and we use the conditional logic in the Gravity Form. So depending on who they are and what they’re doing, we can then pop out other fields and get very detailed in terms of structured note taking. And then that pipes through Zapier and we do a bunch of stuff in the CRM. So it’s really super cool. Oh, and also on the call is our fearless head of DevOps, Arman Zakaryan. Thanks for giving us your time, Arman. So what we’re going to do now is, I’ve got kind of a timeline here, so I think the best thing to do is just go back and chronologically step through the various challenges and solutions that we’ve worked on with you guys. So let’s start. You guys became a customer, it looks like, in August of 2014, which is actually before I started at Pagely, and it looks like we brought two different sites over. We brought gravityforms.com over and then shortly thereafter we brought gravityhelp.com.

Alex Cancado: 04:29 Yeah. So the way it went down was, you know, we’ve had a relationship with Pagely for a while before we even became customers. We’ve always had two sites. We have our marketing site, gravityforms.com, and we’ve had our support site, gravityhelp.com, that handles support requests and also handles license key validation, you know, the API.

Alex Cancado: 05:00 The gravityforms.com site is way simpler, just a marketing site. We have some static content, you know, simple. The gravityhelp.com site, on the other hand, is much more complicated. It has a lot more traffic because all the installations are pinging this site to validate license keys, on top of people coming in for documentation and people coming in to request support. So what we did is, initially we just moved gravityforms.com to Pagely and we kept gravityhelp.com with our existing hosting, just to kind of test the waters and see if things would work with Pagely. So we did that and it was a great experience. We liked it. Performance was unbelievable. And then we were like, okay, so now it’s time to move gravityhelp.com, and that was a big deal. It was a big site, there’s a lot of traffic, and we were already having the beginnings of performance issues on the hosting that we were on. And we had a lot of concerns.

Sean Tierney: 06:08 Let’s step back to the initial need that prompted you to seek us out. What was the issue that you were experiencing with your previous host that made you want to bring gravityforms.com to Pagely?

Alex Cancado: 06:22 So with gravityforms.com, we were not having any issues. Like I said, we knew, you know, Josh, and we wanted to give Pagely a good chance. We wanted to keep things within the WordPress community and we wanted to support what Pagely was doing, and that was the main motivation behind moving gravityforms.com. Okay, cool. Now, after we’d gone through that experience, gravityhelp.com was an actual need. We were having problems with our existing hosting. We didn’t feel like we were getting the level of support that we wanted. And we were comparing against the support that we got with gravityforms.com and we were like, man, we’ve got to make this change. And so that’s what drove the second move, moving our more complex, more traffic-intensive site.

Sean Tierney: 07:21 Got it. Got it. Well, we’re glad for the charity on the first one and that we earned it on the second one. So let’s step through. We brought the site over, we migrated it, and then it looks like a couple of days later the first real need that came up is you guys wanted SSH. You were using VaultPress for backups. It sounds like you were using a desktop database inspection tool and the PhpStorm IDE. So you had these tools that needed to use SSH, and we enabled that. We do SSH a little bit differently in that we don’t use login and password combinations; we use a public key. Arman, can you just talk briefly about why we do it that way? Like, why not the login and password approach?

Arman Zakaryan: 08:09 Yeah. So, first and foremost, it’s more secure for your user if it doesn’t have a password on it, because someone trying to brute-force it will never get the password right. They actually have to have a valid SSH key to get in. And most desktop applications out there, like IDEs or database management apps, can all use SSH keys in their authentication configuration. And it’s a lot easier for us to store and handle public SSH keys than it is for us to store and handle passwords. So just from a security point of view, we really don’t want to touch passwords and handle the way to store them and all that stuff. It can be done and we know how to do it. We just prefer not to, so that if there is ever some sort of breach, that’s one less thing that could be gained. So the SSH key approach is really the way to go. Plus, when you make an SSH key on your side, you can secure that with a passphrase, and we recommend doing that. So it’s basically the same kind of deal. It’s just a different way of authenticating.
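As a rough illustration of the key-based setup described above, the sketch below assembles an `ssh-keygen` command for an Ed25519 key pair. The file name and comment are hypothetical, and the choice of Ed25519 is an assumption (the interview does not name a key type). Because no passphrase flag is passed, `ssh-keygen` prompts for one interactively, which matches the recommendation to protect the private key with a passphrase.

```python
from pathlib import Path


def build_keygen_command(key_path: Path, comment: str) -> list[str]:
    """Return the argv for generating an Ed25519 key pair with ssh-keygen."""
    return [
        "ssh-keygen",
        "-t", "ed25519",      # key type (an assumption; not stated in the interview)
        "-C", comment,        # comment appended to the public key line
        "-f", str(key_path),  # private key path; the public key goes to <path>.pub
    ]


# Hypothetical file name and comment, for illustration only.
cmd = build_keygen_command(Path("deploy_key"), "alex@example.com")
print(" ".join(cmd))
# To actually create the key pair: subprocess.run(cmd, check=True)
```

Only the `.pub` file is shared with the host; the private key and its passphrase stay on your machine.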

Sean Tierney: 09:35 It looked like from that ticket that the IDE, PhpStorm, and Sequel Pro both worked with it, but it sounded like VaultPress did need a login and password. They didn’t have the ability to support the public key. So I know we did something there. We were able to make it work, basically, but it sounds like we prefer not to go that route.

Arman Zakaryan: 09:57 Yeah, this is also really far back. It is possible that maybe they support it now as well.

Sean Tierney: 10:05 Got it. Okay.

Arman Zakaryan: 10:08 Yeah, I’m actually looking at the newest VaultPress help page and they do support SSH keys now.

Sean Tierney: 10:15 We’re looking at a ticket from 2014. Right. Well, and so then we’ve got this period of, it looks like, a year and a half where there’s basically no tickets. So presumably everything was running smoothly for a year and a half. Yeah. And then, okay, let’s see. So this is something we can talk about. In March of 2016, so we host your sites, but then we’re also working pretty closely because, you know, we do have thousands of customers and a bunch of people using Gravity Forms. One customer specifically, actually the episode that’s just before you, we just interviewed them. They operate at a huge scale, you know, hundreds of thousands of pages, a lot of them with lead forms on them, and bots and all kinds of traffic hitting them. So they are this edge case in that they’re using Gravity Forms extensively and getting massive amounts of traffic to it.

Sean Tierney: 11:12 So issues which might not normally rear their head do for this customer. And I think what’s interesting is on this ticket I’m referring to in March of ’16, we were able to actually pipe some feedback back to you guys. We had patched Gravity Forms for this particular client to get rid of this one query that was screwing things up, and then piped that feedback back to you guys so that you could then make a filter to disable it. So it’s an interesting interplay there, where we’re not just your hosting provider, but we’re helping resolve some things in the actual product.

Alex Cancado: 11:54 And that’s part of why it makes so much sense for us to be with Pagely. It’s because we speak the same language, you know our products, and we all know WordPress. So every time we ask for support and we need something, we are right there, we are talking the same language, whereas before it would take three or four conversations before we even started clicking on, oh, that’s what he means. So it’s so much quicker and a much more pleasant experience. Right.

Sean Tierney: 12:30 Okay. So let’s fast forward a little bit. It looks like we’re in September of ’17 now, so we’re quiet for another year and a half. This is where there’s a new site launch. So presumably this is the licensing server that we’re talking about, that we brought over. So tell me specifically, it sounded like you were hitting some scaling issues there with your licensing server at FireHost.

Alex Cancado: 12:53 Yeah, we were having some issues. We didn’t know exactly what the issues were, and we were having some problems with getting support from them. We had not just scaling issues, we had other types of issues, where they would just deny requests from a lot of our clients, based on them thinking that there was malware or they were being attacked or something. But there were a lot of false positives, and those were very hard for us to whitelist. We were telling them, hey, these guys are our clients. They’re trying to hit our service and you guys are blocking it. It was to the point that we had to create kind of a proxy server on Bluehost to validate licenses that were blocked by FireHost. They would actually then try to go to Bluehost and validate through that. So there was that issue and a few other issues, on top of the fact that every time we wanted customer support it would be a battle. Basically, that’s what prompted us to make a change.

Sean Tierney: 14:11 Got it. Okay. So those issues, coupled with the fact that you were getting a good support experience on our side, made you decide to bring that over to Pagely. And it looks like, Arman, maybe you can talk about this. We did some load testing for them, it sounds like. Here’s what’s interesting: I only have visibility into the tickets right now, but there were also some parallel conversations taking place in Slack.

Alex Cancado: 14:37 Yeah. The other awesome thing that Pagely set up for us was, because we knew this was a big deal for us and we had a lot of concerns, we got set up with a nice little Slack channel where I would just talk to Arman and we would brainstorm some things. It just made the communication so much easier than sending emails back and forth, and we were able to get this thing going.

Arman Zakaryan: 15:08 For sure. So this was a pretty fun project for me. I was actually involved in this migration and all of the testing. And the Slack channel, you know, we don’t just set that up for companies like Rocket Genius. For any new VPS customer, if we feel like that’s going to be helpful for the transition, maybe they have some important things about the site, maybe the site’s not as straightforward, we’re more than happy to set that up on a temporary basis for anybody. So Alex shared with us figures about how much traffic their current server gets and what type of calls they’re getting, and provided us with some randomized mock data, and we worked together to write some load tests using a tool called wrk. With that tool, you’re able to create mock POST requests that have different values for the data, and you can say, send it at this level of traffic, this level of concurrency, this many connections, or try to hit this request rate per second.

Arman Zakaryan: 16:31 There are other tools out there that do the same thing. The one we use nowadays is called k6, made by the Load Impact guys, and it’s a little nicer. But yeah, it’s basically a matter of trying to get an understanding of what kind of traffic was coming in and how high the volume of that would be, and standing up a staging copy of the site on Pagely. You know, doing a couple of important things like turning off email sending or credit card purchasing, because it wasn’t live yet, and doing a little bit of load testing. With each load testing run we collect some important data, we’re able to see some metrics on our side, and make some tuning adjustments for the number of PHP workers, the number of NGINX connections, or things like that.
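The load-testing loop described above (randomized mock POST data in, latency metrics out) can be sketched in a few lines. This is not Pagely's actual wrk or k6 script: the payload field names `license_key` and `site_url` are hypothetical, and the latency list stands in for measurements a real tool would collect against a staging copy of the site.

```python
import random
import statistics

HEX_DIGITS = "0123456789abcdef"


def mock_license_request(rng: random.Random) -> dict[str, str]:
    """Build one randomized POST body (hypothetical field names)."""
    return {
        "license_key": "".join(rng.choices(HEX_DIGITS, k=32)),
        "site_url": f"https://site-{rng.randrange(10_000)}.example.com",
    }


def summarize(latencies_ms: list[float], duration_s: float) -> dict[str, float]:
    """Reduce one load-test run to throughput plus median and p95 latency."""
    ordered = sorted(latencies_ms)
    # Nearest-rank 95th percentile, using integer math to avoid float rounding.
    p95_index = (95 * len(ordered) + 99) // 100 - 1
    return {
        "requests_per_second": len(ordered) / duration_s,
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
    }


rng = random.Random(0)  # seeded so the mock data is reproducible between runs
payload = mock_license_request(rng)
report = summarize([10, 20, 30, 40, 50, 60, 70, 80, 90, 100], duration_s=2.0)
print(payload["site_url"], report)
```

Comparing the summary from one run to the next, after each change to PHP worker or NGINX connection counts, is what turns load testing into capacity planning.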

Arman Zakaryan: 17:25 And I believe at the very beginning, before we even knew how it would all work together, we were thinking it might be like a VPS-2 and an RDS large. Alex wanted to try with a VPS-1 and an RDS medium. And so, throughout our load testing phase, we were able to get the right size so that they have enough power to handle what their site demands are for the licensing and for support, but not be overspending at the same time. Depending on the hosting company you go with, or if you’re just doing it yourself, you might not be putting the database on its own separate resource like RDS on Amazon. But we do. Every VPS customer has their site running on a separate RDS resource. Whether that’s in our shared RDS tier or a private RDS add-on, it’s still not running on the actual server that’s your VPS. So we have the ability to scale that independently, and one part of the system having a problem is not going to impact the other one. So it’s a nice bit of flexibility to get there. After doing some load testing and tuning, I think we ended up at a VPS-1 and an RDS large. So, you know, that was a win in terms of cost savings.

Alex Cancado: 18:54 We actually tried the RDS medium and we had problems. I pushed for it, even though Arman was suggesting the large. I was like, man, let’s try the medium because it’s cheaper and, I mean, it might work. And then we came back to the large, and then things just worked way better.

Sean Tierney: 19:12 Arman, you mentioned that there was a switch we were able to do for them from the M series to the R series. Can you talk about what that was?

Arman Zakaryan: 19:19 Yeah, the main issue was not really CPU, it was memory usage. On Amazon, you can pick different types of instances. You can have an R series, which is memory optimized; that puts more emphasis on memory allocation for each size of instance. Whereas the M series is more general purpose. It has a more even blend of CPU and memory. So those are a little bit faster, but they don’t have as much RAM. And then the newest that they have is the T3 series, which is kind of the lower-cost, burstable one, not really made for high performance, but for cost efficiency. I think they currently only offer that one as a medium. So even if you wanted to get a T3 instead of the [inaudible], right now I don’t think they offer that. But yeah, you get that flexibility. We can pick the right instance type with any size, so you don’t necessarily have to upgrade to, like, an RDS extra large to get more memory. We can set you up with an R series instead of an M series, kind of deal.
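To make the M-series versus R-series trade-off concrete, here is a small sketch that compares memory per vCPU and picks the smallest class that satisfies a memory target. The interview does not say which instance generation was used, so the three classes below and their vCPU and memory figures are illustrative assumptions (check the current AWS documentation before relying on them), and the 12 GiB requirement is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceClass:
    name: str
    family: str
    vcpus: int
    memory_gib: float

    @property
    def gib_per_vcpu(self) -> float:
        return self.memory_gib / self.vcpus


# Illustrative figures only; the generation used in the interview is not stated.
CANDIDATES = [
    InstanceClass("db.t3.medium", "burstable", 2, 4.0),
    InstanceClass("db.m5.large", "general purpose", 2, 8.0),
    InstanceClass("db.r5.large", "memory optimized", 2, 16.0),
]


def smallest_fit(candidates: list[InstanceClass], needed_gib: float) -> Optional[InstanceClass]:
    """Return the class with the least memory that still meets the requirement."""
    fits = [c for c in candidates if c.memory_gib >= needed_gib]
    return min(fits, key=lambda c: c.memory_gib) if fits else None


for c in CANDIDATES:
    print(f"{c.name:14} {c.family:17} {c.gib_per_vcpu:.0f} GiB per vCPU")

choice = smallest_fit(CANDIDATES, needed_gib=12)  # hypothetical memory requirement
print(choice.name if choice else "no fit")
```

With a memory-bound workload, the memory-optimized class at the same size meets the target without stepping up to a larger and more expensive size.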

Sean Tierney: 20:43 And so that just contributed to cost savings while better suiting their needs, because they needed more memory, but it allowed them to stay on the lower size. Nice.

Arman Zakaryan: 20:52 Yeah. And, you know, we do host a lot of different companies like black rock, sites like Gravity Forms that are doing these kinds of licensing calls and have kind of a large data footprint. Some of them get a private RDS with us, some of them are fine on the shared tier. I think just from the get-go Rocket Genius wanted to be on a private one, just to have that assured consistency of, like, I’m the only person using this database server. I know if I was on the shared RDS tier, Pagely would always be looking out and making sure everything was fine and dealing with anything, but it’s just sort of reducing the odds of some random other application causing problems for you. Because this is a business-critical part of their site, of their company.

Sean Tierney: 21:44 So aside from that benefit, the noisy neighbor thing, isn’t there also some inspection we’re better able to do, like being able to separate out what queries are running and actually get visibility into exactly what’s happening on their database, where we can’t do that on the shared tier?

Arman Zakaryan: 22:02 So, the most recent development is Amazon’s Performance Insights feature. They brought it first to the Aurora engine, and now you can do it for MySQL and also for MariaDB. And they have it for PostgreSQL as well, but WordPress doesn’t run on Postgres, so it doesn’t matter. With the Performance Insights tool, we can actually see the queries that are happening. We can drill down by database user. We can inspect a certain period of time on that whole timeline. And, you know, we had a lot of these tools that we came up with ourselves, like querying the performance schema. Or there was stuff that we could query when we were running on MariaDB but not if we were running on Aurora. So Amazon actually did some cool stuff to give you all that information, regardless of the engine type, all in a web UI.

Arman Zakaryan: 23:01 So it’s easy for us to parse and see what’s going on with the database. Other benefits of having a private RDS: if you want to have external access to your database and you don’t want to use an SSH tunnel to get to it, we can set up an HAProxy configuration that will allow connections from a restricted list of IP addresses externally. We only really want to do that if you have your own RDS, for security purposes. And it’s helpful for other things. Let’s say you have a major change coming up, and you have a database dump, you have a backup like that, but you want to take extra assurances. We can create an RDS snapshot that’s just for your data, and we can roll back to that snapshot if we need to. There’s the Amazon Backtrack feature for RDS, which is really nice because you can literally, with a couple of clicks, roll your database back without having to provision a new RDS and dump tables out of there and all that stuff. So all the features that you’d be able to use if you were running your own RDS, Pagely can do that. If you have a private RDS, we have access to the same features and we manage that on your behalf. Cool.

Sean Tierney: 24:18 So we’ve got them on their own dedicated thing. We’ve got the database right-sized, we’ve load tested it. So we’re pretty confident at this point. How did that cutover process go? Maybe Alex, you can talk about it. How did that actually go, switching to bring the licensing server live?

Alex Cancado: 24:35 So we brought it live. Initially we brought it live on the RDS medium, against Arman’s advice, and we had issues for the first week. It was not catastrophic. It was working most of the time, but there were some outages and there were some issues. It was not to the point where we were worried and wanted to roll back. It was to the point that we tried a few different scenarios. I don’t remember exactly what we tried, but we tried to handle this a few different ways so that we wouldn’t need to upgrade to an RDS large. And after a week of trying different things and still having the database have some outages now and then, I just decided, you know what, let’s just go to the large. Because what happened was the solutions were starting to get complex, and I really wanted to keep things simple. I didn’t want to add any complexity that we really wouldn’t need. So then I just asked them to upgrade to the large, which was no problem. To me it was just like a click of a button and it happened. And since then we never had any issues. Cool. Well, until this year, but I’m sure we’re going to get there.

Sean Tierney: 26:08 Yeah. We’ll slowly get there. So it looks like the next thing that happened, about a month later, is there’s a ticket around disabling XML-RPC. And so presumably this is all in that security topic. You guys wanted to lock things down. It looked like you were using a security firewall, this Armor thing that you were using at FireHost. It sounds like it was also being used. So what did we do there? Maybe Arman, can you talk a little bit about what we did for that?

Alex Cancado: 26:39 Okay.

Arman Zakaryan: 26:42 Which one are we looking at?

Sean Tierney: 26:44 This is 11/27/17. I think it involved basically blacklisting all of the Internet, but they still needed Jetpack to work. And so it looked like we were able to just whitelist Automattic’s IP block.

Arman Zakaryan: 27:02 Okay.

Arman Zakaryan: 27:04 Sorry, I’m just taking a look through the ticket real quick.

Arman Zakaryan: 27:07 Okay.

Sean Tierney: 27:08 Yeah, no worries. Maybe while you’re doing that, Alex, can you explain for the people who aren’t familiar with XML-RPC, what is that?

Alex Cancado: 27:16 So I am also drawing a blank on that ticket. Can you give me the number again?

Sean Tierney: 27:22 Oh, sure. Uh, it’s ticket, uh, one oh seven seven 22

Arman Zakaryan: 27:33 It was on Carl’s account, so you might not be able to see it.

Alex Cancado: 27:37 Yeah.

Arman Zakaryan: 27:38 But yeah, it looks like you were having some random botnet hitting the XML-RPC endpoint for your sites, and we placed a rule to just block all of it at first.

Alex Cancado: 27:51 Yeah.

Arman Zakaryan: 27:52 And then we followed up with an update to that rule to allow the Jetpack IP ranges.

Sean Tierney: 28:02 And Jetpack is an Automattic service. It basically enables a lot of different things and it connects remotely. And so this is the endpoint that it needs to connect to, to be able to do its thing. I think specifically you guys were using VaultPress, which is their backup system.

Alex Cancado: 28:18 Yeah. So it looks like that remote procedure call endpoint was getting abused, I guess getting attacked, and we really didn’t need it for anything else other than Jetpack. And honestly, I don’t think I was really involved in that request, so I didn’t really remember it.

Arman Zakaryan: 28:36 Okay. This is a pretty routine type of request for us. This kind of thing happens pretty much every day for different customers. And we do some stuff, not to block, but to shortcut that whole 405 return code for GETs. Because if you try to do a GET to xmlrpc.php, WordPress itself, the application, will return a 405 and say only POST requests are accepted for XML-RPC. But that whole transaction actually takes CPU resources to run, and it can use up your PHP workers if there’s a lot of it happening. And actually, we’ve been noticing in the last six or eight months,

Arman Zakaryan: 29:26 there’s all kinds of random things out there on the Internet just doing GETs to XML-RPC. I think they’re trying to use it as a DDoS. So we added some virtual patching to just shortcut that whole process. It still returns a 405 for a GET and it still says only POST requests are accepted, as you would expect WordPress to return, but it’s a good performance fix. But yeah, if you ever need XML-RPC blocked, we can obviously block it entirely, and if you ever need Jetpack allowed for that, we can do that as well. It’s pretty easy for us to do. Cool.
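The two XML-RPC measures described above, answering GETs with a 405 before they reach PHP and blocking POSTs except from allowed ranges, can be sketched as one decision function. This is only an illustration of the logic, not Pagely's gateway rules: the allowlist uses a reserved documentation network as a placeholder rather than Jetpack's real IP ranges, and the 403 for blocked callers is an assumption.

```python
import ipaddress
from typing import Optional

# Placeholder range (reserved documentation network), NOT Jetpack's real ranges.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def xmlrpc_decision(method: str, path: str, client_ip: str) -> Optional[int]:
    """Return a status code to answer with at the gateway, or None to pass to PHP."""
    if not path.endswith("/xmlrpc.php"):
        return None  # not an XML-RPC request; handle normally
    if method != "POST":
        return 405   # same answer WordPress would give, without using a PHP worker
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in ALLOWED_NETWORKS):
        return None  # allowlisted caller; let WordPress handle it
    return 403       # assumed status for blocked callers


print(xmlrpc_decision("GET", "/xmlrpc.php", "203.0.113.5"))
print(xmlrpc_decision("POST", "/xmlrpc.php", "192.0.2.10"))
print(xmlrpc_decision("POST", "/xmlrpc.php", "203.0.113.5"))
```

Answering at the gateway is what saves the CPU time and PHP workers mentioned in the interview; the response the client sees stays the same.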

Sean Tierney: 30:09 All right. Moving on to April of 2018, I think this is where we hit some scaling challenges. So there were some 503 errors happening, it looks like due to a memory limit. And I don’t know, Arman, are you familiar with this one? This had to do with the monitoring. So they were monitoring the homepage and it sounds like that was creating unnecessary load.

Arman Zakaryan: 30:38 I don’t think the monitoring checks were causing the memory errors, but it just wasn’t helping the situation in general. You know, the whole point of monitoring your application is to see if it’s actually running, right? If it can respond to requests, if it connects to the database, and if you’re using an object cache, is the object cache working? So we actually have a custom monitoring endpoint on every site, it’s /pagely/status. And if you point your monitoring to that, you’re going to get more consistent and more reliable results versus hitting the homepage. If you hit the homepage, then you’re either just getting a cached page, and you may not even see if there’s an underlying problem with your PHP, or you’re sending a cache-busting header, or a cache-busting query parameter, excuse me,

Arman Zakaryan: 31:33 for every request, which is what we were seeing in this case. There was like a security anti-cache parameter equal to some long number, and every time it was a different number. That’s going to be invoking a dynamic render for your homepage. And, you know, different sites have different things happening on their homepages. It could actually be a lot of overhead to serve an uncached copy of your homepage. Especially if it’s a monitoring system, they might be doing pings from multiple places, so you could actually get a stampede of requests all trying to hit your homepage uncached. So that’s why we have the Pagely status URL. That’s working through one of our Pagely management must-use plugins. So, you know, it’s always going to be there unless you uninstall it, which we don’t recommend doing because we’ll just reinstall it for you.

Arman Zakaryan: 32:26 But that endpoint will check to see if the gateway is running, if PHP is executing properly, and if it can do an options select on your database. And if you’re using an object cache, like with our Redis object cache drop-in, it will tell you if that’s working properly as well. So, I’m actually sure I wasn’t the person involved on this ticket. I’m not sure if the 503s they were seeing here were because of that or if there was something else going on and maybe this was just contributing to it, but it’s,
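
As a rough illustration of what a dedicated status endpoint does differently from a homepage check, the sketch below runs one named check per dependency and reports each result separately. The function name and the check names are hypothetical; the real endpoint’s implementation is not described in the conversation.

```python
from typing import Callable, Dict


def run_status_checks(checks: Dict[str, Callable[[], bool]]) -> dict:
    """Run each named check and report per-component and overall health.

    A check that raises is recorded as failed rather than crashing the
    status endpoint itself.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False
    return {"ok": all(results.values()), "checks": results}


# Hypothetical wiring: each lambda would be replaced by a real probe,
# for example a trivial database query or an object-cache round trip.
example_checks = {
    "php": lambda: True,
    "database": lambda: True,
    "object_cache": lambda: True,
}
```

A monitor pointed at an endpoint like this gets a small, uncached response that exercises the database and object cache directly, instead of either a cached homepage (which hides backend problems) or a fully rendered uncached one (which adds load).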

Sean Tierney: 33:05 but that’s a good lesson to take away: don’t use the homepage as a monitor, because it’s not necessarily going to catch things like whether the object cache is working. And it’s also going to create a bunch of unnecessary requests that are going to suck up the PHP workers.

Arman Zakaryan: 33:19 You could potentially DoS yourself. Yeah.

Arman Zakaryan: 33:23 Okay.

Arman Zakaryan: 33:24 No, that’s a good segue. 503s, generally you get those if the backend server is at capacity: if your PHP workers are full, or, if you’re using Apache mode, then Apache itself is totally full of requests and can’t handle any more, or it’s hitting the NGINX max connection limits if you’re in NGINX-only mode. Yeah, you know, that’s kind of,

Arman Zakaryan: 33:50 yeah,

Arman Zakaryan: 33:51 we’re there to just look through it and help you figure it out. I think the end result, a recommendation we made in that ticket, was to grab a New Relic trial. New Relic APM has a lot of good transaction tracing, so you can actually go and see the various steps in a web transaction and you could see if maybe there’s some infinite nesting going on. Maybe something is allocating more memory than it should because of a flaw in the code. But if someone ever legitimately needs more memory allocated in PHP, you know, it’s a VPS, every customer has their own resources. So we have no problem adjusting that on a case-by-case basis.

Arman Zakaryan: 34:30 Cool.

Sean Tierney: 34:31 Well, almost because it’s a good segue, a month later they did get a DDoS. It looks like there were a bunch of different agents hitting it and it seemed like they were trying to brute force passwords. So what we did there, it sounds like, is basically look through the logs and give them a list of the accounts they were trying to brute force. And then, you know, Alex, you guys can handle that however you think is best, you know, reset their passwords or contact the people that had the attempted compromises. Right.

Arman Zakaryan: 35:05 Cool.

Arman Zakaryan: 35:06 Yeah. So WordPress out of the box doesn’t make it incredibly easy to get that kind of paper trail. But this is actually a tool our CTO, Josh Eichorn, made that cross-references some stuff in the access logs and pulls some stuff out of the user meta in WordPress to let you see if a certain IP address has had any recent sessions for any of the users on the install. And then we can generate a report that shows you what user it is and what that user’s role is. And that can usually help a lot if you’re trying to deal with, you know, a potential breach. Like maybe someone has malware on their computer and it was able to execute something to try to get into a website, and then it goes from there. Or it could be some unpatched vulnerability, if it’s an old plugin or something like that. There’s a million different possibilities for how something could initially get into an account. But getting that data back out, like I said, WordPress doesn’t actually make that super straightforward. So sometimes you’ve really got to know the inside plumbing of it to be able to extract that kind of data.
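
The internal tool mentioned above is not public, so the following is only a sketch of the access-log half of that kind of analysis. It assumes logs in the common/combined log format and treats repeated POSTs to `/wp-login.php` as login attempts; the threshold and function names are illustrative, and the cross-reference against WordPress user meta is not shown.

```python
import re
from collections import Counter

# Matches the start of a common/combined log format line:
# client IP, identity, user, [timestamp], "METHOD path ...
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)


def login_attempts_by_ip(lines, login_path="/wp-login.php"):
    """Count POST requests to the login path per client IP."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        path = match.group("path").split("?", 1)[0]
        if match.group("method") == "POST" and path == login_path:
            counts[match.group("ip")] += 1
    return counts


def suspicious_ips(lines, threshold=3):
    """Return IPs with at least `threshold` login POSTs, sorted."""
    counts = login_attempts_by_ip(lines)
    return sorted(ip for ip, n in counts.items() if n >= threshold)
```

A report like the one described in the conversation would then join these IPs against session data to show which user accounts, and which roles, each address had touched.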

Sean Tierney: 36:33 Yeah. So Alex, that seems like one of those instances where, had you been at FireHost, there’s no chance they would have been doing that type of thing, right?

Alex Cancado: 36:42 Yeah, they wouldn’t be proactive with telling us what’s going on. So yeah, the big thing for us was all the admin accounts. So we just, you know, got all the administrator accounts and made sure everyone had a strong password. Okay,

Sean Tierney: 36:59 cool. All right, well, I think we’ve got about 10 minutes left and there’s five tickets that I see here left. So we’ll kind of quickly run through those. It looks like a month later we caught an infinite loop. It sounds like just maybe a configuration issue on your side that crept in. And from what I took from that ticket, it was a redirect where HTTP was forcing to, or no, HTTPS was forcing to HTTP, but then it was bumping it back to HTTPS.

Alex Cancado: 37:32 Yeah. There was a period where we were making changes, and that’s one of the cool things about working with you guys: a lot of times we’re making changes and we make a mistake and you guys see it before we see it, even though we were right there making the change. Like, okay, we made the change, deployed something, and then Pagely is like, hey guys, there’s something going on here. Okay, I guess I screwed something up.

Sean Tierney: 37:55 Well it’s like someone riding shotgun I guess as you’re driving it.

Alex Cancado: 37:58 Right.

Arman Zakaryan: 38:04 Something like that. Whoever is on call is sitting there and, you know, maybe nothing’s happening, maybe there’s a lot of things happening, and then we see an alert like that and it’s like, okay, well, it looks like they’re doing something. So that’s somewhat of a relief, you know,

Alex Cancado: 38:19 it’s not like,

Arman Zakaryan: 38:22 yeah. So

Sean Tierney: 38:24 it’s cool. Since it was an easy fix. I thought what was interesting about that one, I mean, you guys actually fixed it, we just alerted you to it and then you fixed it right away. But the interesting thing about that one, from what I took from that ticket, was that Arman, you did something extra just to make sure that the cache was cleared, because there was some kind of thing that was compounding it, where it would’ve been like 30 days potentially in people’s browser cache.

Arman Zakaryan: 38:50 Uh, yeah. So 301s are really hard to shake off everywhere. Obviously you can clear the server-side cache and it won’t be returning a 301 response anymore. But, you know, clients’ browsers will cache 301s pretty aggressively. You’d have to actually go clear your cache, or kill your browser, reopen it, clear the cache again, maybe a couple more times, until it stops doing that redirect. So I don’t think we did much, or I’m not sure we even could do much, for the browser-side caching. But we did make sure that the cache on the server was completely purged. And eventually it got better, because it’s not an issue today. Cool. Max age was set to like a year.
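
The HTTP/HTTPS loop from this ticket is easy to reproduce on paper. The sketch below is purely illustrative: it models redirect rules as a dictionary from URL to redirect target (an assumption, not how any real server stores them) and walks the chain to see whether it revisits a URL.

```python
def find_redirect_loop(start_url, redirects, max_hops=10):
    """Follow redirect rules from start_url.

    `redirects` maps a URL to the URL it redirects to. Returns the list
    of URLs forming the loop (first URL repeated at the end), or None
    if the chain ends or no loop is found within max_hops.
    """
    seen = []
    url = start_url
    for _ in range(max_hops):
        if url in seen:
            return seen[seen.index(url):] + [url]
        seen.append(url)
        if url not in redirects:
            return None  # chain ended at a page that does not redirect
        url = redirects[url]
    return None


# One rule forces HTTPS, a conflicting rule forces HTTP: an infinite loop.
broken_rules = {
    "http://example.com/": "https://example.com/",
    "https://example.com/": "http://example.com/",
}
healthy_rules = {"http://example.com/": "https://example.com/"}
```

Because browsers may keep a permanent redirect for as long as its cache lifetime allows, a loop like this can persist on the client after the server rules are corrected, which matches the clean-up difficulty described above.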

Sean Tierney: 39:50 All right. Okay. A couple weeks later you guys added Discourse. I love that system, by the way, for forums. Discourse, for the people listening that aren’t familiar with it, is a really neat forum system, I think the best thing out there for doing forums. But it is not a WordPress app, it’s its own standalone app, and you guys wanted that to run in a subdirectory, I think for SEO purposes. We do something similar with our own site: we have a knowledge base, which is a totally separate WordPress instance running in a subdirectory. And I think what’s interesting here is maybe you can talk a little bit about reverse proxies and the different ways to accomplish this, because they did something a little different with the reverse proxy.

Arman Zakaryan: 40:34 Yeah. So a reverse proxy is basically a way to make it look like

Arman Zakaryan: 40:42 two different websites are working together. So, you know, if you’re hosting domain.com somewhere and you’re hosting domain.com/forums somewhere else, typically you’d have to either run it as a subdomain, because of how DNS works, or you’d have to use a reverse proxy to make that /forums path be served by a different place. And depending on where the host is in that process, it’s either going to be an incoming reverse proxy or an outgoing reverse proxy. Most of the time our clients are doing an incoming reverse proxy, where they have their DNS pointed somewhere other than Pagely and they want to send a subdirectory like /blog to Pagely. And so we offer that type of support with an add-on. And that basically gets you access to one of our DevOps engineers to work with you on the onboarding.

Arman Zakaryan: 41:46 We’ll collect all the details, make sure that it’s working securely and that it’s properly monitored on an ongoing basis. The other type of reverse proxy, which is what Rocket Genius needed, was an outbound reverse proxy, where they’re pointing DNS to Pagely and they wanted /forum going out to Discourse. And so we support that type of configuration as well. And it’s essentially the same amount of work for us to do that versus an incoming one. Generally, if you’re doing reverse proxying, there’s a couple of important things that you need to do. You have to configure the IP address ranges. If it’s an incoming one, you have to specify which IPs are going to be sending that traffic, as those will be the designated IPs that are authorized to send the X-Forwarded-For header to say what the real visitor’s IP is.
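
The rule about only honoring the forwarded header from designated IPs can be sketched as follows. This is a generic illustration of the technique, not Pagely’s gateway code: the function name and the example ranges are hypothetical, and it uses the common approach of taking the right-most address in the header that is not itself a trusted proxy.

```python
import ipaddress


def real_client_ip(peer_ip, x_forwarded_for, trusted_ranges):
    """Return the visitor IP, honoring X-Forwarded-For only from trusted peers.

    peer_ip:          address of the machine that connected to us
    x_forwarded_for:  raw header value, or None
    trusted_ranges:   CIDR strings for proxies allowed to set the header
    """
    networks = [ipaddress.ip_network(r) for r in trusted_ranges]

    def is_trusted(ip):
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in networks)

    # Anyone can send this header, so ignore it unless the direct peer
    # is one of the designated proxies.
    if not is_trusted(peer_ip) or not x_forwarded_for:
        return peer_ip

    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    # Walk from the proxy nearest to us back toward the client and stop
    # at the first address that is not one of our own proxies.
    for hop in reversed(hops):
        try:
            if not is_trusted(hop):
                return hop
        except ValueError:
            break  # malformed entry: stop trusting the header
    return peer_ip
```

For example, with `10.0.0.0/8` as the only trusted range, a request arriving from `10.0.0.5` with the header `198.51.100.7` resolves to `198.51.100.7`, while the same header sent directly by an untrusted address is ignored and the connecting address is used instead.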

Arman Zakaryan: 42:46 And that’s important so that all the other parts of our system, the rate limiting and the access control and all these things, can work properly, so that we can actually know what the real visitor’s IP is. And then the other important thing is making sure to send the right host name. And that applies whether it’s an incoming or outgoing one. You know, Discourse expects a certain host name that may be different from the actual domain name. You may be hosting your website as one name remotely and proxying it to Pagely under an app name that’s something else. So the host name is the other important part. And then of course, making sure you have the right scheme. You know, if your remote proxy is HTTP only, then we have to send it as HTTP. We can support HTTP or HTTPS, obviously. But that’s one of the other important details to get right. So, um, so the net result

Sean Tierney: 43:46 of that, though, is that they’re able now to use this completely non-WordPress app, Discourse, but have it exist in a subdirectory and look as if it was just a folder in their website. Right?

Arman Zakaryan: 43:59 Right. Instead of having to create something like forum.gravityforms.com and have that be served by Discourse directly, they have gravityforms.com/forums and it’s a more seamless kind of experience. It’s better for SEO as well. And yeah, it’s just a little bit of reverse proxy magic. You know, we use NGINX and we have our own gateway layer that’s built using OpenResty, which is a software bundle based on NGINX. So with ARES we can make all these rules a lot more easily: we can specify what upstreams we want to have, we can match on conditions like the URL or the user agent, and we can create actions like send this to this other upstream. So, you know, even if you’re using something else, if you’re using Apache, you can do the same kind of stuff in Apache as well. But generally our reverse proxy implementations are done in NGINX. Okay.
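
To make the "match on a condition, send to an upstream" idea concrete, here is a small routing sketch. It is not how the gateway is configured (the conversation places that in NGINX/OpenResty); the `Upstream` and `Rule` types, addresses, and host names are all hypothetical. It picks an upstream by longest matching path prefix and carries along the scheme and host name that upstream expects, the two details called out earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Upstream:
    scheme: str        # "http" or "https", whatever the upstream speaks
    address: str       # where the proxy actually connects
    host_header: str   # host name the upstream application expects


@dataclass(frozen=True)
class Rule:
    path_prefix: str
    upstream: Upstream


def pick_upstream(path, rules, default):
    """Return the upstream of the longest matching path-prefix rule."""
    best = None
    for rule in rules:
        prefix = rule.path_prefix.rstrip("/")
        # Match "/forums" and "/forums/..." but not "/forums-archive".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best.path_prefix.rstrip("/")):
                best = rule
    return best.upstream if best else default


def upstream_url(upstream, path):
    """Build the URL the proxy would request from the chosen upstream."""
    return f"{upstream.scheme}://{upstream.address}{path}"


# Hypothetical example: the main site plus a separately hosted forum.
main_site = Upstream("http", "127.0.0.1:8080", "example.com")
forum = Upstream("https", "forum.internal.example", "forum.internal.example")
rules = [Rule("/forums", forum)]
```

With these rules, `/forums/t/1` goes to the forum upstream over HTTPS with the forum’s own host name, while every other path falls through to the main site.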

Sean Tierney: 45:01 Alex, how do you guys like Discourse? Just curious.

Alex Cancado: 45:04 We like it. We like it. We are not getting as much traction as we would hope for in our new forums. We had an old forum that was very popular. You know, when we launched Gravity Forms, we offered support via our forums. And obviously that forum became very popular because it was the one place people would come to get support from us. But it got to a point where we had so many customers that it was hard to manage support in that system. So we had to shut that forum down and provide support via Help Scout, which is what we do now. It works great for us. And Discourse was, you know, us bringing back a forum, but bringing it back as a community forum. So basically an attempt for us to bring the Gravity Forms community together, so developers can talk and users can ask questions to other users and to other developers. And so we’re still in the phase of growing that forum. We do love the software. I think it’s working great. We’re just kind of in the process of getting it popular. Cool.

Sean Tierney: 46:25 Yeah. In terms of the forums I’ve used, I really do think that theirs has the best UX of just about any that’s out there. So I was just curious to hear how you guys find it. Cool. All right, well, I think in the interest of time, we’ll just do one more here and then wrap up. So there was that final ticket. Arman, it’s the one that’s on 2/4/19. It looks like unoptimized query calls to a Gravity Manager plugin. Let’s see.

Alex Cancado: 46:56 Okay.

Sean Tierney: 46:57 And do either of you guys know offhand about that one? Performance issues in February, it would have been early.

Alex Cancado: 47:05 Yeah, February, that’s the latest performance issue that I was talking about. So, you know, everything was great after we migrated, until February. We started seeing some outages and I was getting a little worried because our user base is growing and growing and growing, and I was worried that we were hitting some sort of limit. So I reached out, or maybe you guys reached out, and we started talking: what can we do? I was like, okay, do we need to upgrade the RDS? Is there an RDS bigger than what we have? What do we need to do? And that was the cool thing about Pagely again. If you talk about this with another host, or maybe somebody that’s not as detailed, they’ll just say, yeah, let’s just upgrade to a larger RDS.

Alex Cancado: 48:04 That’s going to solve the problem right there. It’s probably going to solve the problem, but it’s going to cost a lot more. And then Pagely is like, no, hey, the problem is not the RDS. The problem is there are some blocking queries here that are blocking something else. And they noticed that the, not the character set, the, what am I trying to say, the collation of one of the tables was wrong. Or it was not wrong, but it was not a collation that was optimized for this. And then, you know, they suggested that we change the collation of these tables, and we scheduled a short maintenance window and we changed the collation and then, boom, solved the problem. So, I mean, this type of problem would have been a big problem

Alex Cancado: 48:56 if you’re talking about a host that is not willing to go in there and dig and find the specific reason why you have an issue. And you guys were able to do it, and here we go, everything works perfectly again without any upgrades. I mean, I did go back and there were some queries that needed to be optimized. I went back in there and optimized a few things, but really the gist of the problem was, hey, the collation is wrong. If we just update the collation, it’s going to behave better and the locks are going to be less of a problem.
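
The conversation does not say which collations were involved, so the sketch below only shows one generic way to spot a table whose collation differs from the rest of a schema. It assumes the table-to-collation mapping has already been read from the database (MySQL exposes it as `TABLE_NAME` and `TABLE_COLLATION` in `information_schema.TABLES`); the table names and collation values here are hypothetical.

```python
from collections import Counter


def collation_outliers(table_collations):
    """Return (most common collation, {table: collation} for the rest).

    `table_collations` maps table name -> collation name.
    """
    if not table_collations:
        return None, {}
    majority, _ = Counter(table_collations.values()).most_common(1)[0]
    outliers = {
        table: collation
        for table, collation in table_collations.items()
        if collation != majority
    }
    return majority, outliers


# Hypothetical schema snapshot, for illustration only.
snapshot = {
    "wp_posts": "utf8mb4_unicode_ci",
    "wp_options": "utf8mb4_unicode_ci",
    "wp_licenses": "latin1_swedish_ci",
}
```

A mismatch found this way is only a lead to investigate, not a diagnosis; as described above, the actual fix was applied during a scheduled maintenance window after the blocking queries had been traced.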

Sean Tierney: 49:30 Yeah. It’s interesting. So being in sales, I get to bump into the people that are coming from other hosts and hear their issues, and this type of thing on any other hosting provider, one in particular that I’m thinking of, is a perfect opportunity for them to say, oh no, you need more hardware. You’ve got to double or triple your spend on your hardware. Right. It’s a chance for them to bump things. We operate a little differently. We want to solve what that root problem is, because in the long run it’s actually just easier to support. You know, we’re invested in your success. We want you guys to not be always going down and creating support tickets. So I think this is a great example of that, in that what could be papered over with a band-aid just by getting bigger hardware, no, we’re going to try to really figure out what that root cause is.

Arman Zakaryan: 50:14 Yeah. You know, there’s some problems that more hardware is not going to solve. There’s some situations where the query is just fundamentally doing it wrong, right? It’s not really something that you can solve with hardware. My team, the DevOps team, actually spends a lot of time working on customer cases like this where we could very easily just say, you need to buy a bigger RDS, you need to buy a bigger VPS, and just make more money. But we actually spend a lot of time trying to drill into these issues and get to the bottom of them. Increasing hardware is really a last resort, and you don’t want to increase hardware when, you know, it’s kind of wasteful. You’re spending extra money, you’re sort of papering over inefficiencies. And we really pride ourselves on helping our customers make those decisions: really exhaust all those different factors and see, do you really need an RDS extra large? Do you really need four times the hardware for your VPS? Maybe you don’t. Maybe you just need to do a couple of things differently, and you may spend a little extra money for your developer to do that in the short term, but in the long term your ongoing hosting costs are going to be lower.

Sean Tierney: 51:43 I think what’s also interesting is, just IP value-wise, if heaven forbid you guys decided to go to some other host, this fix follows you. Because otherwise, if it wasn’t fixed, then the problem’s going to exist wherever you take it. So it’s just nice. It’s essentially IP that’s valuable to the company. So yeah. Cool. Well guys, I think that’s probably a good place to wrap it up. Alex, is there anything you want to add? Like, having been with us for five years now, what would you say to someone investigating Pagely?

Alex Cancado: 52:17 Well, honestly, I have nothing but the best things to say about Pagely, and it’s interesting because initially I was the one that was hesitating on the move to Pagely. And it was because of support: I wasn’t sure if I was getting the right support in the beginning. But something happened at Pagely, there was like a complete change of gears, and support really stepped up to a level that impresses me every time, every support ticket, every interaction that I have. And that’s not just with Arman, who is excellent; I’ve interacted with a bunch of other support engineers and they all solve problems. They all go straight to the problem. And, you know, I remember my nightmares of hitting the wrong support person. I was like, man, I just hit the wrong support guy. And you just go through loops and loops and loops, and you know that you have the wrong guy who’s not going to solve the problem, and there’s nothing you can do. You don’t hit those guys at Pagely. You get the right guy and he’s there, he’s going to fix it. So really I have nothing but very good things to say, and it’s mostly because of support. Support is at a level that’s really impressive.

Sean Tierney: 53:46 Cool. Well, we’re happy that you guys are continuing to grow and we hope to be there for a long time. And thank you so much for taking the time, both of you guys.

Alex Cancado: 53:55 Very good. Cheers. Cheers.

Sean Tierney: 54:00 Thank you.
