SantiagoVargas 20 hours ago

Built YourServerIsDown.com as a side project that we needed for our startup... anyone else have the issue of not finding out quickly enough if your server went down?

For our app it's super important, because if our server goes down, users can download the app but get stuck at the sign-in flow. There are subscription services out there that do more in-depth monitoring, but this is all we needed.

I listed an alternative solution below for those wanting to build or customize their own. Ours just gets the job done, is quick to set up, and lets you avoid the monthly Twilio/SMS fees.

Another alternative we received as feedback, for those interested: "If any one wants an AWS Native way and assuming it has ALB you can target elb metric 503 via Cloudwatch Alarm and create an output to an SNS topic that goes to Slack, or use AWS chatbot/q, or set number as destination for sms via sns"
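
A rough sketch of what that CloudWatch alarm could look like, written as a plain function that builds the parameters for the PutMetricAlarm API. The function name, alarm name, thresholds and ARNs are placeholders I made up for illustration. The metric `HTTPCode_ELB_503_Count` in the `AWS/ApplicationELB` namespace is the ALB-generated 503 counter.

```python
def build_alb_503_alarm(alarm_name: str, load_balancer: str, sns_topic_arn: str) -> dict:
    """Build keyword arguments for CloudWatch's PutMetricAlarm call.

    `load_balancer` is the ALB dimension value, e.g. "app/my-alb/50dc6c495c0c9188"
    (placeholder). `sns_topic_arn` is the topic that fans out to Slack or SMS.
    """
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/ApplicationELB",
        "MetricName": "HTTPCode_ELB_503_Count",
        "Dimensions": [{"Name": "LoadBalancer", "Value": load_balancer}],
        "Statistic": "Sum",
        "Period": 60,                # evaluate one-minute buckets (assumed)
        "EvaluationPeriods": 1,      # alarm on the first bad minute (assumed)
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # The metric is only emitted when 503s occur, so treat gaps as healthy.
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 installed and credentials configured, the dict would be passed as `boto3.client("cloudwatch").put_metric_alarm(**params)`.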

  • lurk2 15 hours ago

    If the service has no monthly fee, how is it being paid for?

    • SantiagoVargas 11 hours ago

      It's a one-time $4.99 fee (covers a year of monitoring or 50 downtime events).

      • Zanfa 9 hours ago

        That’s an annual subscription.

        • SantiagoVargas 7 hours ago

          I suppose you're right, unless your server goes down more than 50 times. I saw it as credits that expire in a year, would be a bit scary to offer monitoring in perpetuity for $5 if they didn't expire.

          • lurk2 6 hours ago

            Consider changing your landing page to reflect the price (“Only $5 Annually!”). The reason I asked was because the way it is now makes it look like the service is being offered for free, which made me think it was a phishing scheme.

            • SantiagoVargas 6 hours ago

              I appreciate the feedback! Just implemented this, hadn't thought of that. Cheers.

cocoa19 12 hours ago

Who’s this built for and what is the use case?

This tool wouldn’t be useful for most (if not all) enterprise services I’ve worked on. For enterprise, you want a fully featured synthetics service such as ThousandEyes, plus an internal monitoring and alerting system.

Also you typically don’t want to expose your health endpoint to the outside world. It’s a security risk.

  • SantiagoVargas 11 hours ago

    It's aimed at indie devs/startups shipping ideas quickly. We built it for ourselves while we were starting an app on the AWS free tier, which occasionally went down when usage spiked. It notified us so we could fix things quickly, before losing users who could download the app but not create an account. It can be set up in 30 seconds without needing to code anything, so it's mainly for coders who want a quick and easy solution.

    So we're not aiming for enterprise on this one. We kept the pricing accessible and the features minimal.

    For the health endpoint, as long as it only returns a 200 status code (without disclosing info like tokens, resource info, or server configuration), the risk is minimal.
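
    As a minimal sketch of that kind of endpoint, using only the Python standard library: it answers 200 with an empty body on `/health` and nothing else. The handler name, `make_server` helper and port are illustrative choices, not anything from our stack.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    # Replace the default "BaseHTTP/x Python/y" Server header so the
    # response does not advertise runtime versions.
    server_version = "health"
    sys_version = ""

    def do_GET(self):
        if self.path == "/health":
            self.send_response(200)
            self.send_header("Content-Length", "0")  # empty body, status only
            self.end_headers()
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real service would log requests


def make_server(port: int = 8080) -> HTTPServer:
    """Create (but do not start) the health server on localhost."""
    return HTTPServer(("127.0.0.1", port), HealthHandler)
```

    Calling `make_server().serve_forever()` would start it. In a real app this would be one route in whatever framework the backend already uses.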

bitbasher 15 hours ago

First and foremost, I love a good side hustle.

With that being said, I find these kinds of notifications produce more false positives than correct detections of downtime. That ends up costing more time checking and double-checking.

On the other hand, if you are running a service with no users and you have downtime... did you really have downtime?

If you run a service and you have downtime and no one reports it, did you have downtime?

I don't even check for my services. If something goes down, I'll find out via email from one or more of my customers. It happens very rarely.

  • SantiagoVargas 6 hours ago

    If a tree falls in the forest and no one is around to hear it, does it make a sound?

    You bring up a good point. I think it's less of a problem for more established companies that don't face unexpected outages too often. When we were starting out with our mobile app, however, this wasn't the case, and each outage meant lost downloads, which were critical for getting early feedback. I see it as a bigger pain point for early founders and small teams whose servers can see a lot of volatility.

    So far we haven't encountered any false positives (been using it for around 6 months) but perhaps with the wrong endpoint that could be a problem. I'll keep an eye out for that.
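
    One common way to keep false positives down, sketched here as an assumption rather than a description of what our service does, is to alert only after several consecutive failed checks. `confirmed_down`, its defaults, and the injectable `sleep` are hypothetical names for illustration.

```python
import time
from typing import Callable


def confirmed_down(
    check: Callable[[], bool],
    attempts: int = 3,
    delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True only if `check` fails `attempts` times in a row.

    `check` returns True when the server looks healthy. A single success
    at any point cancels the alert, which filters out one-off blips.
    """
    for i in range(attempts):
        if check():
            return False
        if i < attempts - 1:
            sleep(delay_s)  # wait before retrying, but not after the last try
    return True
```

    The trade-off is detection latency: with the defaults above, a real outage is confirmed roughly 10 seconds after the first failed check.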

  • cf100clunk 15 hours ago

    > I find these kinds of notifications to provide more false positives than correctly detecting downtime

    There are services like Textbelt that leave the trigger mechanisms all up to you and your local tools:

    https://textbelt.com/

vivzkestrel 14 hours ago

how do you determine if the server went down?

  • esafak 13 hours ago

    By checking a health end point. (I'm not the owner.)

    • SantiagoVargas 11 hours ago

      Correct. It requires an unauthenticated endpoint that returns a 200 response. Usually this is the /health endpoint, but as long as we can send a ping, it works.
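
      In spirit the check is no more than the following sketch (standard-library Python; `is_up` and the timeout value are illustrative, not our production code):

```python
import http.client
import urllib.request


def is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True only if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # OSError covers refused connections, DNS failures, timeouts and
        # urllib's URLError/HTTPError (raised for 4xx/5xx responses).
        return False
```

      Anything other than a clean 200 within the timeout counts as down, including a 500 or a connection that hangs.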

    • vivzkestrel 12 hours ago

      ok, how does it actually work? i get that you'll check for 500 errors by hitting multiple endpoints every x units of time. But the number of endpoints you must check also keeps going up for your service. Today you start and have 10 endpoints, 6 months down the line you need to check 10000 endpoints every x units of time. How do you manage scaling this?

      • SantiagoVargas 11 hours ago

        Right, we ping the servers every minute. Since we charge a one-time fee, the credits expire after a year, but the service is scalable. To answer your question, I'll give you some more context:

        The architecture uses scalable AWS serverless components (Lambda, SQS, DynamoDB) and is well suited to handle a large increase in monitored endpoints. The primary scaling mechanism is the automatic concurrency scaling of the Lambda functions that process messages from the SQS queues. Should we scale to 10,000 endpoints, we do expect some bottlenecks that would require tuning, e.g. increasing Lambda timeouts or memory, but we'll cross that bridge when we get to it.

        For the actual SMS sending, our numbers can send up to 100 texts per second.
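
        To give a feel for the fan-out side, here is a hedged sketch of how a scheduler could turn a list of monitored URLs into SQS batch entries. The function name and message shape are made up for illustration. The one hard fact is that SQS's SendMessageBatch accepts at most 10 entries per call.

```python
import json
from typing import Iterator

SQS_MAX_BATCH = 10  # SendMessageBatch accepts at most 10 entries per call


def to_sqs_batches(urls: list[str]) -> Iterator[list[dict]]:
    """Yield lists of SendMessageBatch entries, one message per URL."""
    for start in range(0, len(urls), SQS_MAX_BATCH):
        chunk = urls[start:start + SQS_MAX_BATCH]
        yield [
            {
                "Id": str(start + i),  # must be unique within a batch
                "MessageBody": json.dumps({"url": url}),
            }
            for i, url in enumerate(chunk)
        ]
```

        Each batch would go out via boto3's `sqs.send_message_batch(QueueUrl=..., Entries=batch)`. At 10,000 endpoints that is 1,000 batch calls per minute, with Lambda concurrency absorbing the consumer side.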

        • vivzkestrel 10 hours ago

          thank you for the detailed responses. so i understand that you have a Lambda function that fetches a website url from DynamoDB and fires a request to it. since Lambdas require a memory limit and a timeout, how much memory is each function using, and what is the timeout for a request (30s?) Also, does each Lambda function handle a single url, or are we doing asyncio/aiohttp stuff with a whole bunch of urls in one go?
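
          for reference, the "whole bunch of urls in one go" option i have in mind would look roughly like this. it's a sketch with the standard library only (threads via `asyncio.to_thread` instead of aiohttp), and `check_all` / `max_concurrency` are names i made up, not anything from the service.

```python
import asyncio
from typing import Callable


async def check_all(
    urls: list[str],
    check: Callable[[str], bool],
    max_concurrency: int = 50,
) -> dict[str, bool]:
    """Run a blocking `check(url)` for every URL concurrently.

    Returns a mapping of url -> True (up) / False (down).
    """
    sem = asyncio.Semaphore(max_concurrency)  # cap in-flight checks

    async def one(url: str) -> tuple[str, bool]:
        async with sem:
            # Run the blocking HTTP check in a worker thread (Python 3.9+).
            return url, await asyncio.to_thread(check, url)

    return dict(await asyncio.gather(*(one(u) for u in urls)))
```

          `asyncio.run(check_all(urls, some_check))` would drive it. note the default thread pool is small, so real concurrency is bounded by its worker count unless a larger executor is configured.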